Computerized Analysis of MR and Ultrasound Images of Breast Lesions
Maryellen  L.  Giger,  Ph.D. 


INTRODUCTION 

The main hypothesis to be tested is that computerized analysis of breast ultrasound and
MR images should yield new methods for distinguishing between malignant and benign lesions
and thus reduce the number of unnecessary biopsies. In addition, even higher performance is
expected  when  a  combination  of  features  from  mammographic,  MR,  and  ultrasound  images  is 
used  as  an  aid  to  radiologists  in  the  task  of  distinguishing  between  malignant  and  benign 
lesions.  The  main  goal  of  the  proposed  research  is  to  develop  noninvasive,  computerized 
methods  for  analyzing  ultrasound  and  MR  (magnetic  resonance)  images  of  breast  lesions  to  aid 
radiologists  in  their  workup  of  suspect  lesions.  The  specific  objectives  of  the  research  to  be 
addressed  are:  1.  Create  a  database  of  ultrasound  and  MR  images  including  malignant  lesions, 
benign  solid  masses,  and  complex  cysts;  2.  Develop  noninvasive,  computerized  methods  for 
characterizing  the  lesions  to  yield  an  output  related  to  the  probability  of  malignancy;  and  3. 
Evaluate  the  efficacies  of  the  new  image  analysis  methods  in  the  task  of  distinguishing  between 
malignant  and  benign  lesions.  It  is  expected  that  the  results  from  this  research  will  aid 
radiologists  in  determining  the  likelihood  of  malignancy  and  in  reducing  the  number  of  benign 
cases  sent  to  biopsy.  Computerized  image  analysis  techniques  that  can  objectively  and  reliably 
classify  lesions  based  upon  reported  sonographic  and/or  MR  characteristics  of  benign  and 
malignant  masses,  especially  if  combined  with  their  mammographic  features,  could  significantly 
improve  the  specificity  of  breast  imaging  and  the  evaluation  of  breast  masses.  The  proposed 
work is novel in that computer-aided diagnosis techniques applied to gray-scale sonographic
images have not yet been reported. In addition, computerized analysis of MR images of the
breast  has  mainly  been  limited  to  only  temporal  analysis  using  contrast  media. 


BODY 

Breast  cancer  is  a  leading  cause  of  death  in  women,  causing  an  estimated  46,000  deaths  per 
year  (1).  Mammography  is  the  most  effective  method  for  the  early  detection  of  breast  cancer, 
and  it  has  been  shown  that  periodic  screening  of  asymptomatic  women  does  reduce  mortality 
(2-4).  Many  breast  cancers  are  detected  and  referred  for  surgical  biopsy  on  the  basis  of  a 
radiographically  detected  mass  lesion  or  cluster  of  microcalcifications.  Although  general  rules 
for  the  differentiation  between  benign  and  malignant  mammographically  identified  breast 
lesions  exist  (5,  6),  considerable  misclassification  of  lesions  occurs  with  the  current  methods. 
On  average,  less  than  30%  of  masses  referred  for  surgical  breast  biopsy  are  actually  malignant 
(7). 


Breast  sonography  is  used  as  an  important  adjunct  to  diagnostic  mammography  and  is 
typically  performed  to  evaluate  palpable  and  mammographically  identified  masses  in  order  to 
determine their cystic vs. solid natures. The accuracy of ultrasound has been reported to be
96-100% in the diagnosis of simple benign cysts (8). Masses so characterized do not require
further  evaluation;  however,  75%  of  masses  prove  to  be  indeterminate  or  solid  on  sonography 
and are candidates for further intervention (9). With the advent of modern high-frequency
transducers  that  have  improved  spatial  and  contrast  resolution,  a  number  of  sonographic 
features  have  emerged  as  potential  indicators  of  malignancy,  while  other  features  are  typical  for 
benign masses (10, 11). Benign features include hyperechogenicity, ellipsoid shape, mild
lobulation, and a thin, echogenic pseudocapsule. Malignant features include spiculation, angular
margins, marked hypoechogenicity, posterior acoustic shadowing, and a depth/width ratio
greater than 0.8. Recently, Stavros et al. used these and other features to characterize masses
as benign, indeterminate, and malignant (12). Their classification scheme had a sensitivity of
98.4%  and  a  negative  predictive  value  of  99.5%.  However,  the  sonographic  evaluation 
described  by  these  investigators  is  much  more  extensive  and  complex  than  is  usually  performed 
at  most  breast  imaging  centers. 

Breast  MR  imaging  as  an  adjunct  to  mammography  and  sonography  reveals  breast  cancer 
with  a  higher  sensitivity  than  do  mammography  and  sonography  only  (13).  However,  using  all 
three  methods  in  the  human  interpretation  process  yielded  a  lower  specificity.  It  also  has  been 
shown  that  temporal  analysis  from  dynamic  MR  correlates  with  intensity  of  fibrosis  in 
fibroadenomas  (14).  Some  computerized  analyses  of  spatial  features  are  being  performed. 
Adams et al. achieved a separation between malignant and benign lesions using a statistical
analysis; however, their database consisted of only 16 cases (15).

Computerized  image  analysis  techniques  that  can  objectively  and  reliably  classify  lesions 
based  upon  reported  sonographic  and/or  MR  characteristics  of  benign  and  malignant  masses, 
especially  if  combined  with  their  mammographic  features,  could  significantly  improve  the 
specificity  of  breast  imaging  and  the  evaluation  of  breast  masses.  Computer-aided  techniques 
have  been  applied  to  the  color  Doppler  evaluation  of  breast  masses  with  promising  results  (16). 
However,  color  Doppler  imaging  is  a  technique  which  focuses  only  upon  the  vascularity  of 
lesions.  Since  not  all  sonographically  visible  cancers  have  demonstrable  neovascularity,  this 
technique is inherently somewhat limited. On the other hand, computer-aided diagnosis
techniques applied to gray-scale sonographic images have not yet been reported. In addition,
computerized  analysis  of  MR  images  of  the  breast  has  mainly  been  limited  to  only  temporal 
analysis  using  contrast  media. 

Comprehensive  summaries  of  investigations  in  the  field  of  mammography  CAD  have  been 
published  by  the  co-P.I.  (17, 18).  In  the  1960s  and  70s,  several  investigators  attempted  to 
analyze  mammographic  abnormalities  with  computers.  These  previous  studies  demonstrated 
the  potential  capability  of  using  a  computer  in  the  detection  of  mammographic  abnormalities. 
Gale  et  al.  (19)  and  Getty  et  al.  (20)  are  both  developing  computer-based  classifiers,  which  take 
as  input  diagnostically-relevant  features  obtained  from  radiologists’  readings  of  breast  images. 
Getty  et  al.  found  that  with  the  aid  of  the  classifier,  community  radiologists  performed  as  well  as 
unaided  expert  mammographers  in  making  benign-malignant  decisions.  Swett  et  al.  (21)  are 
developing  an  expert  system  to  provide  visual  and  cognitive  feedback  to  the  radiologist  using  a 
critiquing  approach  combined  with  an  expert  system.  At  the  University  of  Chicago,  we  have 
shown  that  the  computerized  analysis  of  mass  lesions  (22)  and  clustered  microcalcifications 
(23) on digitized mammograms yields performance similar to that of an expert mammographer and
significantly better than that of average radiologists in the task of distinguishing between
malignant and benign lesions.

The proposed work is novel in that computer-aided diagnosis techniques have not yet
been applied to gray-scale breast ultrasound and/or MR images. In addition, the use of
computers to merge features from mammographic, MR, and ultrasound images as an aid to
radiologists has not yet been investigated.

The  main  goal  of  the  proposed  research  is  to  develop  noninvasive,  computerized  methods 
for  analyzing  ultrasound  and  MR  (magnetic  resonance)  images  of  breast  lesions  to  aid 
radiologists  in  their  workup  of  suspect  lesions.  The  specific  objectives  of  the  research  to  be 
addressed  are:  1.  Create  a  database  of  ultrasound  and  MR  images  including  malignant  lesions, 
benign  solid  masses,  and  complex  cysts;  2.  Develop  noninvasive,  computerized  methods  for 
characterizing  the  lesions  to  yield  an  output  related  to  the  probability  of  malignancy;  and  3. 
Evaluate  the  efficacies  of  the  new  image  analysis  methods  in  the  task  of  distinguishing  between 
malignant  and  benign  lesions.  It  is  expected  that  the  results  from  this  research  will  aid 
radiologists in determining the likelihood of malignancy and in reducing the number of benign
cases  sent  to  biopsy. 

1.  Establishment  of  a  database  of  ultrasound  and  MR  images 
Methods 

Approximately  500  sonographically  demonstrated  lesions  will  be  collected  which  will 
include  aspirated  complex  cysts,  and  biopsied  solid  benign  and  malignant  masses.  The 
database  of  these  collected  cases  will  include  the  MR,  sonographic,  and  mammographic  images 
as  well  as  the  lesions’  ultimate  dispositions  and  diagnoses.  (Note  that  funding  already  exists 
for  the  computerized  analysis  of  the  mammographic  lesions).  Based  upon  our  current  case 
load,  we  estimate  that  approximately  30%  of  the  lesions  will  be  complex  cysts  which  required 
aspiration  to  prove  their  cystic  nature,  40%  will  be  benign  solid  masses,  and  30%  will  be 
cancers. Palpable and mammographically identified masses are evaluated sonographically by
acquiring representative images in orthogonal planes and obtaining measurements in these same
planes; most masses are also evaluated with color Doppler imaging. Although the preliminary studies
on  ultrasound  images  involved  the  digitization  of  ultrasound  films,  the  ultrasound  images  in  this 
new  database  will  be  obtained  directly  from  an  ATL  ultrasound  machine,  which  produces  digital 
image  data.  In  addition,  approximately  50  cases  of  MR  images  of  the  breast  will  be  collected 
with a T1-weighted sequence, using coronal slices. After injection of Gd contrast, 4 to 6 scans
of both breasts will be obtained at 90 sec. time intervals. Biopsy results will be used to
determine truth regarding malignancy.

Results  to  Date 

We currently have retrospectively collected over 400 ultrasound cases of mass lesions, all of
which had gone on to either biopsy or cyst aspiration. The images were obtained from the University
of Chicago and Northwestern University. The images are transferred in digital format from the
ATL  unit.  The  digital  images  within  the  ATL  unit  are  obtained  by  screen  capture.  For  each 
case  we  have  at  least  two  views  of  the  lesion.  We  are  currently  collecting  the  corresponding 
mammograms  for  the  study.  We  have  digitized  over  100  cases  with  2  to  7  films  per  case.  We 
expect to finish the database by the end of summer 2001. Approximately 150 cases will not be
digitized because either the lesion was not seen mammographically or the mammographic finding
that prompted the call back for ultrasound did not correspond to the lesion interpreted on the
ultrasound.

We currently have retrospectively collected 35 coronal MR cases from the University of
Muenster, 362 sagittal MR cases of the breast from the University of Pennsylvania, and 90 cases
from the University of Berlin (which follow a protocol similar to that of the University of Muenster).
These are all volume datasets. Of the 362 sagittal cases, 253 are focal (192 malignant, 51
benign,  10  normal),  74  are  diffuse  lesions  (48  malignant  and  25  benign),  10  are  ductal  (9 
malignant  and  1  benign),  and  25  showed  no  enhancement  (3  malignant,  19  benign,  3  normal). 

2.  Development  of  computerized  method  for  the  classification  of  lesions 

The  computerized  method  will  include  the  image  analysis  of  the  texture  within  the  lesion, 
the  analysis  of  the  margin  of  the  lesion,  and  a  comparison  of  the  lesion  with  its  surrounding 
tissue.  Computerized  analysis  of  the  texture  pattern  in  the  lesion  will  be  based  on  various 
texture  analysis  methods  we  have  been  investigating  in  our  laboratory  including  Fourier  spectra 
analysis  and  artificial  neural  networks.  We  note  that  it  is  extremely  important  to  understand  the 
relationship  between  the  mathematical  texture  measures  and  the  physical  nature  of  the  breast 
parenchyma. 

The  computerized  analysis  of  the  margin  characteristics  (edge  definition)  will  involve 
feature  extraction  using  radial  edge-gradient  analysis.  We  have  done  similar  analysis  on 
radiographic  masses  in  determining  their  margin  characteristics  (spiculated  and  ill  defined)  (22). 
Two promising measures are the FWHM (full width at half maximum) and the average radial
gradient, which correspond to the degree of spiculation and how ill-defined the margin is,
respectively. From the computer-extracted margin, we will also determine the shape and
irregularity of each lesion.
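
As an illustration of the margin analysis described above, the following Python sketch computes one possible form of an average radial gradient along a lesion margin. The function name, the use of the margin centroid as the radial origin, and the absence of any normalization are assumptions made for this example; they are not taken from the method in (22).

```python
import numpy as np

def average_radial_gradient(image, margin_points):
    """Mean gray-level gradient component along the outward radial direction.

    image: 2D array of gray levels.
    margin_points: (N, 2) integer array of (row, col) positions on the margin.
    Hypothetical helper: the centroid-based radial direction is an assumption.
    """
    image = np.asarray(image, dtype=float)
    pts = np.asarray(margin_points, dtype=int)
    grad_r, grad_c = np.gradient(image)           # derivatives along rows, cols
    center = pts.mean(axis=0)                     # centroid of the margin points
    radial = pts - center
    norm = np.linalg.norm(radial, axis=1)
    keep = norm > 0                               # skip a point at the centroid
    unit = radial[keep] / norm[keep, None]        # unit radial vectors
    g = np.stack([grad_r[pts[keep, 0], pts[keep, 1]],
                  grad_c[pts[keep, 0], pts[keep, 1]]], axis=1)
    return float(np.mean(np.sum(g * unit, axis=1)))
```

In this sketch a sharply defined margin gives a large magnitude and an ill-defined margin gives a value near zero; the sign depends on whether the lesion is darker or brighter than its surroundings.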

Specifically  for  the  ultrasound  images,  comparison  of  the  "density"  and  the  texture  patterns 
of  the  lesion  with  neighboring  regions,  including  those  deep  to  the  lesion,  will  be  performed  in 
order  to  quantify  its  echogenicity  and  the  amount  of  any  posterior  acoustic  shadowing  or 
enhancement.  This  will  be  performed  by  comparing  feature  values  "below"  the  lesion  to  those 
obtained alongside and below the lesion.
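
One simple way to quantify posterior acoustic behavior is sketched below in Python. It compares the mean gray level in a region directly behind the lesion with the mean in flanking regions at the same depth. The bounding-box inputs, the region sizes, and the function name are assumptions for illustration only and are not the feature definitions used in this work.

```python
import numpy as np

def posterior_acoustic_contrast(image, top, bottom, left, right):
    """Mean gray level behind the lesion minus that of the flanking regions.

    (top, bottom, left, right) is a hypothetical lesion bounding box, with
    rows increasing with depth. Negative values suggest shadowing and
    positive values suggest enhancement. Region sizes are assumptions.
    """
    image = np.asarray(image, dtype=float)
    height = bottom - top
    width = right - left
    depth_end = min(bottom + height, image.shape[0])
    behind = image[bottom:depth_end, left:right]
    left_flank = image[bottom:depth_end, max(left - width, 0):left]
    right_flank = image[bottom:depth_end, right:min(right + width, image.shape[1])]
    flanks = np.concatenate([left_flank.ravel(), right_flank.ravel()])
    if behind.size == 0 or flanks.size == 0:
        raise ValueError("lesion box leaves no room for comparison regions")
    return float(behind.mean() - flanks.mean())
```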

Temporal  features  will  be  determined  from  analyzing  the  MR  image  data  over  time.  The 
contrast medium uptake curve will be analyzed at various spatial locations within and around the
suspect  lesion.  Temporal  operators  include  the  maximum  uptake,  mean  gradient  of  uptake,  and 
rms  variation.  Both  two-dimensional  and  three-dimensional  features  will  be  calculated,  e.g., 
irregularity  and  margin  gradient  characteristics.  In  addition,  the  spatial  features  will  be 
investigated  as  a  function  of  time. 
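
The temporal operators named above can be sketched in Python as follows. The precise definitions (relative enhancement with respect to the first scan, gradient per second, and rms deviation about the mean uptake) are assumptions chosen for this example; the text does not specify them.

```python
import numpy as np

def temporal_features(signal, dt=90.0):
    """Uptake features from the intensity time course at one location.

    signal: intensities; the first entry is assumed to be pre-contrast.
    dt: time between scans in seconds (90 s in the protocol described here).
    The exact feature definitions are assumptions for illustration.
    """
    s = np.asarray(signal, dtype=float)
    uptake = (s - s[0]) / s[0]                  # relative enhancement per scan
    gradient = np.diff(uptake) / dt             # change in uptake per second
    return {
        "max_uptake": float(uptake.max()),
        "mean_gradient": float(gradient.mean()),
        "rms_variation": float(np.sqrt(np.mean((uptake - uptake.mean()) ** 2))),
    }
```

Applying the function voxel by voxel within and around a segmented lesion would give spatial maps of each temporal feature.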

We  plan  to  use  artificial  neural  networks  along  with  other  measures  of  the  mass  in  question 
to  obtain  an  estimate  of  the  likelihood  of  malignancy.  We  will  investigate  merging  the 
ultrasound  image  features  and  MR  features  with  those  from  mammographic  images  of  the  same 
lesion.  We  already  have  funding  support  for  the  investigation  involving  radiographic  imaging 
of  masses. 

The  various  features  will  serve  as  input  data  and  will  be  supplied  to  the  input  units  of  the 
artificial  neural  network.  Prior  to  input  to  the  ANN,  the  features  will  be  normalized  between  0 
and  1.  The  output  data  from  the  neural  network  are  then  obtained  through  successive  nonlinear 
calculations  in  the  hidden  and  output  layers.  The  calculation  at  each  unit  in  a  layer  includes  a 
weighted  summation  of  all  entry  numbers,  an  addition  of  a  certain  offset  number,  and  a 
conversion  into  a  number  ranging  from  0  to  1  using  a  sigmoid-shape  function  such  as  a 
logistic  function.  The  neural  network  will  be  trained  by  a  back-propagation  algorithm  using 
pairs  of  training  input  data  and  desired  output  data.  The  desired  output  data  will  be  initially  1  if 
features  of  a  malignant  lesion  are  input  and  0  otherwise.  Once  trained,  the  neural  network  will 
accept  features  of  a  lesion  and  will  output  a  value  that  will  be  related  to  a  likelihood  of 
malignancy.  Feature  selection  will  be  performed  by  analyzing  the  average  and  standard 
deviation  of  the  various  features  for  both  malignant  and  benign  lesions.  Az  values  will  be 
calculated for each feature as well as for the output of the ANNs. In addition, genetic
algorithms, which we have used in a pilot study to optimize feature selection for the task of
distinguishing true-positive from false-positive mass detections, will also be used.
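
The network computation described above (features scaled to the range 0 to 1, a weighted sum plus an offset at each unit, a logistic conversion, and training by back-propagation toward targets of 1 for malignant and 0 otherwise) can be sketched in Python as below. The single hidden layer, the squared-error loss, and all hyperparameter values are assumptions for illustration, not the configuration used in this project.

```python
import numpy as np

def normalize_features(X):
    """Min-max scale each feature (column) to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)      # avoid division by zero
    return (X - lo) / span

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, w2, b2):
    """Weighted sum, offset, and logistic conversion at every unit."""
    hidden = logistic(X @ W1 + b1)
    return logistic(hidden @ w2 + b2)

def train(X, y, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    """Batch back-propagation on squared error (arbitrary hyperparameters)."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(scale=0.5, size=n_hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = logistic(X @ W1 + b1)
        out = logistic(h @ w2 + b2)
        d_out = (out - y) * out * (1 - out)           # output-layer delta
        d_hid = np.outer(d_out, w2) * h * (1 - h)     # hidden-layer deltas
        w2 -= lr * (h.T @ d_out) / len(y)
        b2 -= lr * d_out.mean()
        W1 -= lr * (X.T @ d_hid) / len(y)
        b1 -= lr * d_hid.mean(axis=0)
    return W1, b1, w2, b2
```

After training, `forward` returns one value between 0 and 1 per lesion, which can then serve as the decision variable in ROC analysis.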

Results  to  Date:  Ultrasound 

We  are  developing  computerized  analyses  of  breast  lesions  in  ultrasound  images  to  aid  in 
the  discrimination  between  malignant  and  benign  lesions  (24).  We  extracted  and  calculated 
features  related  to  lesion  margin,  shape,  homogeneity  (texture)  and  the  nature  of  the  posterior 
acoustic  attenuation  pattern  in  ultrasound  images  of  the  breast.  Our  database  contained  184 
digitized  ultrasound  images  from  58  patients  with  78  lesions.  Benign  lesions  were  confirmed 
by biopsy, cyst aspiration, or image interpretation alone, while malignant lesions were confirmed
by  biopsy.  ROC  analysis  was  used  to  study  the  performance  of  the  various  individual  features 
and  the  output  from  linear  discriminant  analysis  in  distinguishing  benign  from  malignant 
lesions.  From  ROC  analysis,  the  feature  characterizing  the  margin  yielded  Az  values  of  0.85 
and  0.75,  in  the  task  of  distinguishing  between  benign  and  malignant  lesions  in  the  entire 
database  and  in  an  equivocal  database,  respectively.  The  "equivocal"  database  contained  lesions 
that  had  been  proven  to  be  benign  or  malignant  by  either  cyst  aspiration  or  biopsy.  Linear 
discriminant  analysis  round-robin  runs  yielded  Az  values  of  0.94  and  0.87  in  the  task  of 
distinguishing between benign and malignant lesions in the entire database and the equivocal
database,  respectively. 

We  are  currently  evaluating  the  method  on  a  database  of  ultrasound  images  from 
Northwestern  University.  The  database  of  over  400  cases  includes  pathology  truth  as  well  as 
radiologists' BI-RADS ratings, with all cases having gone to biopsy or aspiration. Our previous
method  required  radiologists'  manually-drawn  lesion  contours  as  input  to  the  computerized 
classification  scheme.  The  current  method,  however,  involves  automatic  segmentation  of  the 
lesion  contour  from  the  ultrasound  image  data.  Of  the  410  cases,  126  were  complex  cysts,  186 
were  benign  solid  lesions,  and  98  were  malignant  lesions.  Features  related  to  lesion  margin, 
shape,  echogenicity  (texture)  and  posterior  acoustic  attenuation  were  automatically  extracted. 

To  evaluate  the  performance  of  the  computer  alone,  the  entire  database  was  divided  into  training 
and  testing  groups.  The  independent  linear  discriminant  analysis  yielded  a  validation  result  of 
an  Az  of  0.89  and  a  partial  Az  value  at  0.90  sensitivity  of  0.52.  In  addition,  in  order  to  evaluate 
the  performance  of  the  computer  relative  to  that  of  the  radiologists,  125  cases  were  assessed  for 
suspicion  by  an  expert  sonographer.  Round-robin  analysis  in  the  task  of  distinguishing 
malignant  from  benign  lesions  yielded  Az  values  of  0.88  and  0.92  for  the  computer  and  the 
radiologist,  respectively. 

We  have  submitted  two  manuscripts  to  Medical  Physics  —  one  on  the  computerized 
segmentation  method  and  one  on  the  computer-extracted  ultrasound  features.  These  preprints 
are  in  the  appendix. 

Results  to  Date:  MRI 

We  are  developing  computerized  analyses  of  breast  lesions  in  MR  images  to  aid  in  the 
discrimination between malignant and benign lesions (25). Dynamic MR data were obtained
from 27 patients by a T1-weighted sequence, using 64 coronal slices, a typical slice thickness of
2 mm, and a pixel size of 1.25 mm. After injection of Gd-DTPA contrast, 4 to 6 scans of both
breasts  were  obtained  at  90  sec.  time  intervals.  The  database  contained  13  benign  and  15 
malignant  lesions.  Our  computerized  classification  method  includes  temporal  features  of 
normalized  speed  and  inhomogeneity  of  uptake,  and  spatial  features  of  margin  descriptors  such 
as  circularity  and  irregularity.  Our  results  indicate  that  classification  based  on  temporal  and 
spatial  features  combined  can  yield  a  positive  predictive  value  of  94%,  and  has  the  potential  to 
reduce  the  number  of  unnecessary  biopsies  by  approximately  92%. 

We  have  developed  a  new  method  for  automatically  extracting  the  lesion  from  the  3D 
image  set  of  the  breast.  Our  previous  results  were  based  on  the  use  of  manually-drawn  lesion 
contours  in  the  various  slices  of  the  MR  data.  The  new  segmentation  method  involves  the  use 
of  an  encompassing  shell  to  limit  the  region  for  local  thresholding.  ROC  analysis  yielded  Az 
values  of  0.90  when  the  manual  segmentation  was  used  in  the  classification  and  0.93  when 
automatic  segmentation  was  included. 

We are currently evaluating the method on 362 cases from the University of Pennsylvania
as well as the cases from the University of Muenster and the University of Berlin. The UPENN
images differ from our initial database in that these cases are sagittal and had fat suppression
applied; a modification in the automatic lesion extraction method was therefore made. For the
evaluation, we developed a new interface
for the human delineation of the lesion margin in 3D to serve as "margin truth". While
outlining the margin in a slice, the observer can see the outline in other views. One
performance index is an overlap calculation in which, in terms of voxels, we calculate the
intersection of the human and computer margins divided by the union. We now have this
margin truth for roughly 200 cases and we are now running the overlap comparison to
determine whether the computer outlines are similar to the human outlines. We will also do the
comparison in terms of the performance of the features extracted from the lesion in the task of
distinguishing malignant and benign lesions.
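
The overlap index described above (intersection divided by union, counted in voxels) can be written as a short Python sketch. It assumes that both delineations have already been converted to boolean voxel masks of the same shape; the function name and the convention for two empty masks are choices made for this example.

```python
import numpy as np

def overlap(human_mask, computer_mask):
    """Intersection over union of two boolean voxel masks of equal shape."""
    a = np.asarray(human_mask, dtype=bool)
    b = np.asarray(computer_mask, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0      # convention chosen here for two empty masks
    return float(np.logical_and(a, b).sum() / union)
```

An overlap of 1 means the two segmentations agree voxel for voxel, and 0 means they share no voxels.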


3.  Evaluation  in  the  task  of  distinguishing  between  malignant  and  benign  lesions. 

In  order  to  test  the  capability  of  the  neural  networks  to  learn  the  features  of  malignant  and 
benign  lesions,  a  consistency  test  will  be  conducted  in  which  the  network  is  first  trained  with  all 
the  cases  in  the  database  and  then  tested  with  the  same  cases  used  in  the  training.  A  consistency 
test  indicates  that  the  network  is  able  to  "remember"  all  of  the  input  types  that  were  used  for 
training.  However,  it  is  more  important  to  test  if  the  network  can  learn  a  generalized  set  of 
inputs  from  the  examples  provided  and  if  it  can  then  make  a  correct  prediction  for  new  cases 
that were not included in the training. Thus, a round-robin method will be employed to test the
network's generalizing ability. With this method (also called the jack-knife method), all but one
of the cases are selected randomly from the database for training of the network, and the
remaining case is used for testing the network. The output values are then compared to the "truth" data. Various
combinations of training and testing pairs will be selected by using a random number generator
and the results will be analyzed using ROC analysis. ROC curves will be obtained by fitting
continuous  output  data  from  the  neural  networks  using  the  LABROC4  program  (26).  The  area 
under  the  ROC  curve  (Az)  will  be  used  as  an  indicator  of  performance.  In  order  to  determine 
the  structure  of  the  neural  network  as  well  as  the  necessary  number  of  training  iterations,  we  will 
analyze  the  consistency  results  and  the  round-robin  results  in  terms  of  Az  as  a  function  of 
number  of  iterations,  momentum,  learning  rate  and  number  of  hidden  units.  We  use  Az  as  an 
indicator  of  performance  since  it  includes  information  on  both  the  sensitivity  and  specificity  of 
the  measures. 
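
The round-robin evaluation can be sketched in Python as follows. To keep the example short, it substitutes a Fisher linear discriminant for the neural network and a nonparametric (Wilcoxon) area for the fitted binormal Az from LABROC4; both substitutions, and all function names, are assumptions for illustration only.

```python
import numpy as np

def lda_fit(X, y):
    """Fisher linear discriminant: weight vector and class-mean midpoint."""
    X0, X1 = X[y == 0], X[y == 1]
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    Sw = np.atleast_2d(Sw) + 1e-6 * np.eye(X.shape[1])   # small ridge term
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    return np.linalg.solve(Sw, m1 - m0), 0.5 * (m0 + m1)

def round_robin_scores(X, y):
    """Leave one case out, train on the rest, and score the held-out case."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=int)
    scores = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        w, mid = lda_fit(X[keep], y[keep])
        scores[i] = (X[i] - mid) @ w
    return scores

def empirical_auc(scores, y):
    """Nonparametric area under the ROC curve (Wilcoxon statistic)."""
    scores, y = np.asarray(scores, dtype=float), np.asarray(y, dtype=int)
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))
```

Scoring the training cases themselves instead of the held-out case would give the consistency result, which is normally higher than the round-robin result.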

The  proposed  techniques  are  expected  to  yield  measures  about  the  likelihood  of 
malignancy.  Receiver  Operating  Characteristic  (ROC)  analysis  (26)  will  be  employed  in 
evaluating the performance of the measures. We have used ROC analysis successfully in
evaluating the performance of both human observers and computerized schemes. The
image features will be evaluated in terms of their ability to yield an estimate
of the likelihood of malignancy. The decision variable for the ROC analysis will be each
individual  feature  as  well  as  combined  measures  within  a  modality  and  combined  measures 
from  multiple  modalities  (x-ray,  MR,  and  ultrasound). 

We  expect  that  500  lesions  and  their  ultrasound  images  will  be  available  for  testing.  Note 
that  here  the  measure  of  performance  will  be  the  Az  value  (from  ROC  analysis)  obtained  in  the 
task  of  distinguishing  between  malignant  and  benign  lesions.  To  obtain  an  estimate  of  the 
number  of  lesions  needed  for  adequate  statistical  power  in  testing  differences  in  Az  values,  we 
assume  only  a  correlation  of  0.60  between  the  estimates  of  Az  that  are  found  for  our  current 
method  involving  the  computerized  analysis  of  mammographic  lesions  (Az=0.87)  and  that  for 
the expected improved method (Az=0.92). With Npos patients who have a malignant lesion and
Nneg patients who have a benign lesion, the standard error of the resulting estimate can be
approximated (Eqn. 9 in Ref. 27) by the expression $\{[2A_z-(1-f)(1-A_z)](1-A_z)/(3N_{pos})\}^{1/2}$,
where f represents the ratio Npos/Nneg. Thus, with f = 1, the statistical power at a critical
significance level of α = 0.05 for 500 mass lesions is 94%.
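
The power estimate can be checked numerically with the Python sketch below. It assumes that the standard-error expression is a square root with 3Npos in the denominator, that the 500 lesions split evenly (Npos = Nneg = 250), and that power is computed for a two-sided z-test on the difference of two correlated Az estimates; these details are assumptions, since the text does not spell out the calculation.

```python
import math

def se_az(az, n_pos, f=1.0):
    """Approximate standard error of Az, with f = n_pos / n_neg."""
    return math.sqrt((2 * az - (1 - f) * (1 - az)) * (1 - az) / (3 * n_pos))

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_for_difference(az1, az2, n_pos, f=1.0, r=0.60, z_crit=1.959964):
    """Power of a two-sided z-test on two correlated Az estimates."""
    se1, se2 = se_az(az1, n_pos, f), se_az(az2, n_pos, f)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    z = abs(az2 - az1) / se_diff
    return normal_cdf(z - z_crit) + normal_cdf(-z - z_crit)

power = power_for_difference(0.87, 0.92, n_pos=250)
```

Under these assumptions the computed power comes out close to the 94% quoted above.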

Results  to  Date 

The results from the evaluation of the methods are described in the preliminary studies
above. We have submitted two manuscripts to Medical Physics: one on the
computerized segmentation method and one on the computer-extracted ultrasound features.
These preprints are in the appendix. We also presented preliminary results on the combination
of mammographic and sonographic features on the common database at SPIE Medical Imaging
2001.

KEY  RESEARCH  ACCOMPLISHMENTS 

•  Development  of  an  automatic  method  for  segmenting  lesions  on  ultrasound. 

•  Development  of  robust  features  for  characterizing  lesions  in  ultrasound  images  of  the 
breast. 

•  Development  of  robust  features  for  characterizing  lesions  in  MRI  images  of  the  breast. 

•  Investigation  and  development  of  methods  for  segmentation  in  2D  for  ultrasound  images 
and  in  2D  and  3D  for  MRI  image  datasets. 

•  Preliminary  investigation  of  merging  mammographic  and  sonographic  features  of  lesions 
on  a  common  database. 


REPORTABLE  OUTCOMES 

1. Gilhuijs KGA, Giger ML, Bick U: Automated analysis of breast lesions in three
dimensions using dynamic magnetic resonance imaging. Medical Physics 25: 1647-1654,
1998.

2. Giger ML, Al-Hallaq H, Huo Z, Moran C, Wolverton DE, Chan CW, Zhong W:
Computerized analysis of lesions in ultrasound images of the breast. Academic
Radiology 6: 665-674, 1999. (also being reprinted in the Yearbook of Radiology)

3. Horsch K, Giger ML, Venta LA, Huo Z, Vyborny CJ: Computer-aided diagnosis of
breast lesions on ultrasound. Proceedings, International Workshop on Digital
Mammography. Toronto, Canada, June, 2000.

4. Horsch K, Giger ML, Venta LA, Vyborny CJ: Automatic segmentation of breast
lesions on ultrasound. Medical Physics (in press).

5. Horsch K, Giger ML, Venta LA, Vyborny CJ: Computerized diagnosis of breast
lesions on ultrasound. Medical Physics (accepted with revision).

6. Giger ML, Huo Z, Horsch K, Hendrick E, Venta L, Vyborny CJ: Computer-aided
diagnosis of lesions on multimodality images of the breast. Proc. SPIE 2001 (in press).

CONCLUSIONS 

We have made great strides in the development of methods for the classification of lesions
in ultrasound and MR images of the breast. We are retrospectively collecting large datasets of
ultrasound and MR cases with solid pathology truth and radiologists' ratings. These cases
include malignant lesions, benign solid masses, and complex cysts. We are developing
noninvasive, computerized methods for characterizing the lesions to yield an output related to
the probability of malignancy and plan to evaluate the efficacies of the new image analysis
methods in the task of distinguishing between malignant and benign lesions. It is expected that
the results from this research will aid radiologists in determining the likelihood of malignancy
and in reducing the number of benign cases sent to biopsy.
Abstract

We  present  a  computer-aided  diagnosis  (CAD)  method  for  breast  lesions  on  ultrasound 
that  is  based  on  the  automatic  segmentation  of  lesions  and  the  automatic  extraction  of 
four  features  related  to  the  lesion  shape,  margin,  texture  and  posterior  acoustic  behavior. 
Using  a  database  of  400  cases  (94  malignant  lesions,  124  complex  cysts  and  182  benign  solid 
lesions),  we  investigate  the  marginal  benefit  of  each  feature  in  our  CAD  method  and  the 
performance  of  our  CAD  method  in  distinguishing  malignant  lesions  from  various  classes  of 
benign  lesions.  Finally,  independent  validation  is  performed  on  our  CAD  method.  Eleven 
independent trials yielded an average Az value of 0.87 in the task of distinguishing malignant
from  benign  lesions. 


Keywords:  Breast  sonography,  computer-aided  diagnosis. 
1 Introduction

Although  ultrasound  is  currently  used  to  diagnose  simple  cysts  with  a  reported  accuracy 
of  96-100%  [1],  it  is  not  currently  used  to  differentiate  solid  lesions  by  most  radiologists. 
Biopsy  or  aspiration  is  the  usual  management  for  those  lesions  that  are  not  diagnosed  as 
clearly cystic during the ultrasound exam. In general, for masses undergoing surgical biopsy,
only 10 to 31% are actually cancerous. As biopsy is associated with greater cost and patient
anxiety, researchers are exploring the diagnostic capability of breast sonography in
differentiating malignant from benign solid masses. In a recent study, Stavros et al. [2] developed a
classification  scheme,  using  various  human-extracted  sonographic  features,  that  achieved  a 
sensitivity  of  98.4%  and  a  negative  predictive  value  of  99.5%  on  a  data  set  of  750  solid  breast 
masses. 

Computer-aided diagnosis (CAD) in breast ultrasound is being explored by various
researchers. Giger et al. have developed a computer-aided diagnosis scheme that uses
clinically-motivated, computer-extracted sonographic features to quantify breast lesion shape,
margin, texture and posterior acoustic behavior [3]. Other researchers have concentrated on
computer-extracted texture features [4], [5] or RF signal characteristics [6]. Sahiner et al. [7]
have explored computerized characterization of breast masses using texture features extracted
from three-dimensional ultrasound images.

We  present  a  CAD  method  for  breast  lesions  on  ultrasound  that  performs  automatic 
feature  extraction  on  automatically-segmented  lesions.  The  computer-extracted  features  are 
then  merged  through  linear  discriminant  analysis.  Three  studies  were  performed  on  a  large 
clinical  database  of  400  cases:  1)  evaluation  of  the  marginal  benefit  of  each  feature  to  our 
CAD  method,  2)  determination  of  the  performance  of  our  CAD  method  in  distinguishing 
carcinomas  from  different  types  of  benign  lesions,  and  3)  independent  validation  of  the 
method using 11 independent trials.

2 Material and Methods

2.1  Database 

The database in our study consists of 400 consecutive ultrasound examinations acquired
during diagnostic breast evaluations at the Lynn Sage Breast Center of Northwestern Memorial
Hospital, in which lesions were detected and either biopsied or aspirated. Of the 400 cases,
94 were malignant solid lesions, 124 were complex cysts, 107 were benign tumors
(fibroadenomas and papillomas), 65 were fibrocystic disease, and 10 had other benign causes. The
database contains no simple cysts. The number of each lesion type and information on their
size is listed in Table 1. Examples of a malignant mass, a complex cyst, fibrocystic disease
and a fibroadenoma are shown in Figure 1. The 757 images in our study were obtained with
an ATL 3000 unit using a 5 MHz transducer and were captured directly from the 8-bit video
signal. The number of images per case varied from one to six.

2.2  Notation 

In what follows, the image gray level data is denoted by $I(m,n)$, where $m = 0, 1, \ldots, M_I - 1$
and $n = 0, 1, \ldots, N_I - 1$. Here, $M_I$ is the number of pixels in the lateral direction of the
image and $N_I$ is the number of pixels in the depth direction of the image. The gradient
image is denoted by $\nabla I$ and is computed using Sobel filters. The gray level data of a
subimage, or region of interest (ROI), is denoted by $R(m,n)$, where $m = 0, 1, \ldots, M_R - 1$
and $n = 0, 1, \ldots, N_R - 1$. Again, $M_R$ is the number of pixels in the lateral direction of the
ROI and $N_R$ is the number of pixels in the depth direction of the ROI. The points on the
lesion margin have $x$ and $y$ coordinates $(\gamma_1(j), \gamma_2(j))$, where the index $j = 0, 1, \ldots, J - 1$ and
$J$ is the number of points in the margin. We will also require a vector $\hat{r}(m,n)$ of unit length
in the radial direction from the geometric center of the lesion to the point indexed by $(m,n)$.
The geometric center is computed from the lesion mask $L(m,n)$, a binary image having value 1
within the lesion and 0 elsewhere, and from the area $A$ of the lesion.
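
The display equation that defines the geometric center is not legible in this copy. A form consistent with the definitions of the lesion mask $L(m,n)$ and the area $A$, assuming the usual centroid of a binary mask (the symbols $m_c$ and $n_c$ are introduced here for illustration only), would be

$$(m_c, n_c) = \frac{1}{A} \sum_{m=0}^{M_I-1} \sum_{n=0}^{N_I-1} (m, n)\, L(m,n), \qquad A = \sum_{m=0}^{M_I-1} \sum_{n=0}^{N_I-1} L(m,n).$$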


2.3  Lesion  Segmentation 

For  each  image,  lesions  were  segmented  both  manually  and  automatically  from  the  normal 
breast  tissue. 

Manual segmentation involved displaying each ultrasound image on an IBM PowerDisplay20
monitor and having a mammographer or medical physicist outline the lesion margin.
Of the 757 images, 360 were outlined by a mammographer and 397 were outlined by a medical
physicist. In another paper [8], we performed an analysis to compare the lesion margins
of  a  medical  physicist  with  the  lesion  margins  of  an  experienced  mammographer.  Each  of 
113  images  was  outlined  by  two  mammographers  and  a  medical  physicist,  and  the  overlap  [8] 
between  the  first  and  second  mammographers  was  compared  to  the  overlap  between  the  first 
mammographer  and  the  medical  physicist.  In  this  way,  we  demonstrated  that  the  medical 
physicist  performs  similarly  to  an  experienced  mammographer  in  the  task  of  outlining  lesion 
margins  on  ultrasound  images. 

Our automatic lesion segmentation algorithm involves the following steps [8]: (1)
preprocessing by cropping and median filtering the image, (2) multiplication with a Gaussian
constraint function, (3) determination of potential lesion margins through gray-value
thresholding, and (4) maximization of a utility function for the potential lesion margins. The
Gaussian constraint function is centered at the manually defined lesion center, which is
determined by computing the geometric center of the manually segmented lesion margins. (For a
description of how the geometric center is computed, see Section 2.2.)


Submitted  to  Medical  Physics: 




We use a utility function called the Average Radial Derivative (ARD), which gives the
average directional derivative in the radial direction,

$$\mathrm{ARD} = \frac{1}{J} \sum_{j=0}^{J-1} \nabla I(\gamma_1(j), \gamma_2(j)) \cdot \hat{r}(\gamma_1(j), \gamma_2(j)), \tag{2}$$

where, as defined in Section 2.2, $I$ is the image gray level data, $\nabla I$ is the gradient image,
$(\gamma_1, \gamma_2)$ is the discretized lesion margin, $J$ is the number of points in the discretized margin,
and $\hat{r}(\gamma_1, \gamma_2)$ is the unit vector in the radial direction from the geometric center of the lesion
to the point $(\gamma_1, \gamma_2)$. Note that the center of the lesion is the only information defined
manually for the automatic segmentation algorithm.
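
As an illustration of Eq. (2), the following minimal sketch evaluates the ARD for one candidate margin. It is not the authors' implementation: the function names, the `image[n][m]` storage convention, and the 1/8 scaling of the Sobel response are assumptions made for this example, and margin points are assumed to lie in the interior of the image.

```python
import math


def sobel_gradient(image, m, n):
    """Sobel estimate of the gradient of I at interior pixel (m, n).

    Assumed layout: image[n][m] holds I(m, n), with m the lateral index
    and n the depth index. The 1/8 factor is an illustrative scaling.
    """
    def pixel(mm, nn):
        return image[nn][mm]

    gx = (pixel(m + 1, n - 1) + 2 * pixel(m + 1, n) + pixel(m + 1, n + 1)
          - pixel(m - 1, n - 1) - 2 * pixel(m - 1, n) - pixel(m - 1, n + 1)) / 8.0
    gy = (pixel(m - 1, n + 1) + 2 * pixel(m, n + 1) + pixel(m + 1, n + 1)
          - pixel(m - 1, n - 1) - 2 * pixel(m, n - 1) - pixel(m + 1, n - 1)) / 8.0
    return gx, gy


def average_radial_derivative(image, margin, center):
    """Mean of grad(I) . r_hat over the margin points, as in Eq. (2)."""
    cx, cy = center
    total = 0.0
    for x, y in margin:
        dx, dy = x - cx, y - cy
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            raise ValueError("margin point coincides with the lesion center")
        gx, gy = sobel_gradient(image, x, y)
        # Project the gradient onto the unit radial vector.
        total += (gx * dx + gy * dy) / dist
    return total / len(margin)
```

In a segmentation loop, each candidate margin produced by thresholding would be scored this way and the margin with the largest ARD retained.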


2.4 Automatic Feature Extraction

In the clinical evaluation of breast lesions on ultrasound, radiologists consider features that
include lesion shape, margin definition, echogenic texture, and posterior acoustic enhancement
or shadowing [9]. Benign lesions tend to demonstrate a shape that is wider than tall,
well-defined, smooth margins, and posterior acoustic enhancement. Benign solid lesions tend
to be hyperechoic while benign cysts tend to be anechoic. Malignant lesions, on the other
hand, tend to be taller than wide with ill-defined, angular margins while also manifesting
hypoechogenicity and posterior acoustic shadowing. We will consider computer-extracted
features which quantify these clinically-used features.

The shape feature that we consider is the depth-to-width ratio (DWR) of the lesion,
which is defined by

$$\mathrm{DWR} = \frac{\text{Depth}}{\text{Width}} = \frac{\max_j \gamma_2(j) - \min_j \gamma_2(j)}{\max_j \gamma_1(j) - \min_j \gamma_1(j)}, \tag{3}$$

where $j = 0, 1, \ldots, J - 1$. (See Section 2.2 for the definition of $\gamma_1$, $\gamma_2$ and $J$.) Cysts and
benign solids tend to be wider than deep and thus, benign lesions tend to yield smaller values
for the DWR than malignant lesions.
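
A minimal sketch of Eq. (3), assuming the margin is supplied as a list of (lateral, depth) coordinate pairs; the function name is hypothetical.

```python
def depth_to_width_ratio(margin):
    """Depth-to-width ratio of a lesion margin, following Eq. (3).

    margin: list of (x, y) points, x lateral (gamma_1), y depth (gamma_2).
    """
    xs = [x for x, _ in margin]
    ys = [y for _, y in margin]
    width = max(xs) - min(xs)
    depth = max(ys) - min(ys)
    if width == 0:
        raise ValueError("margin has zero lateral extent")
    return depth / width
```

For a margin whose bounding box is 10 pixels wide and 4 pixels deep, the function returns 0.4, a value on the "wider than deep" side that the text associates with benign lesions.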
The margin feature that we consider is the normalized radial gradient (NRG), which is a
measure of the average orientation of the gray level gradients along the margin. It is given
by [10, 11]

$$\mathrm{NRG} = \frac{\sum_{j=0}^{J-1} \nabla I(\gamma_1(j), \gamma_2(j)) \cdot \hat{r}(\gamma_1(j), \gamma_2(j))}{\sum_{j=0}^{J-1} \lVert \nabla I(\gamma_1(j), \gamma_2(j)) \rVert}. \tag{4}$$

(See Section 2.2 for the definition of $\nabla I$, $\hat{r}$, $\gamma_1$, $\gamma_2$ and $J$.) In general, the NRG varies between
-1 and 1, being near 1 when the gradients tend on average to point radially outward and near
-1 when the gradients tend on average to point radially inward. For ultrasound, however, the
NRG tends to be greater than zero. This is because almost all US lesions of significance tend
to be darker (i.e., less echogenic) than the surrounding tissue, and therefore, the gradients
along the lesion margin tend, on average, to point outward toward increasing gray-level
values. The benign lesions tend to yield larger values of the NRG. Observe that the NRG
contains no information about the magnitude of the gradient along the margin.
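
The behavior of Eq. (4) at its extremes can be checked with a short sketch. It assumes the gradient vectors at the margin points have already been computed (for example with Sobel filters); the function name and argument layout are hypothetical.

```python
import math


def normalized_radial_gradient(margin, gradients, center):
    """NRG of Eq. (4) from margin points and their gradient vectors.

    margin:    list of (x, y) margin points
    gradients: list of (gx, gy) gradient vectors at those points
    center:    (x, y) geometric center of the lesion
    """
    cx, cy = center
    radial_sum = 0.0
    magnitude_sum = 0.0
    for (x, y), (gx, gy) in zip(margin, gradients):
        dx, dy = x - cx, y - cy
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            raise ValueError("margin point coincides with the lesion center")
        radial_sum += (gx * dx + gy * dy) / dist   # gradient projected on r_hat
        magnitude_sum += math.hypot(gx, gy)        # gradient magnitude
    if magnitude_sum == 0.0:
        raise ValueError("all gradients along the margin are zero")
    return radial_sum / magnitude_sum
```

Gradients that all point radially outward give 1 regardless of their magnitudes, gradients that all point inward give -1, and purely tangential gradients give 0, which matches the range described above.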

To quantify texture, the autocorrelation in depth of $R$, the gray level values in the minimal
rectangular ROI containing the lesion, is used to define

$$\mathrm{COR} = \sum_{n=0}^{N_R-1} \frac{\bar{C}_y(n)}{\bar{C}_y(0)}, \tag{5}$$

where the autocorrelation in depth and its sum in the lateral direction are

$$C_y(m,n) = \sum_{p=0}^{N_R-1-n} R^2(m, n+p)\, R^2(m,p),$$

$$\bar{C}_y(n) = \sum_{m=0}^{M_R-1} C_y(m,n).$$

A  picture  of  the  minimal  rectangular  ROI  for  an  example  lesion  is  shown  in  Figure  2.  Observe 
that  the  COR  is  a  sum  and  not  an  average.  Thus,  COR  includes  not  only  texture  information, 
but  also  size  information. 
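
The size dependence of COR can be seen in a small sketch of Eq. (5). The function name and the `roi[m][n]` layout are assumptions for this example, and the use of squared gray levels follows the autocorrelation expression as read from this copy.

```python
def depth_autocorrelation_feature(roi):
    """COR of Eq. (5) for a rectangular ROI.

    Assumed layout: roi[m][n] = R(m, n), with m the lateral index and
    n the depth index, so each roi[m] is one column of gray levels in depth.
    """
    m_r = len(roi)
    n_r = len(roi[0])
    squared = [[v * v for v in column] for column in roi]

    def c_bar(n):
        # Autocorrelation in depth at lag n, summed over the lateral direction.
        return sum(squared[m][n + p] * squared[m][p]
                   for m in range(m_r)
                   for p in range(n_r - n))

    c0 = c_bar(0)
    if c0 == 0:
        raise ValueError("ROI is identically zero")
    return sum(c_bar(n) for n in range(n_r)) / c0
```

For a constant ROI with depth extent $N_R$, the lag-$n$ term is proportional to $N_R - n$, so COR equals $(N_R + 1)/2$: 1.5 for a depth of 2 pixels and 2.5 for a depth of 4 pixels. The feature therefore grows with lesion size even when the texture is unchanged.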

Posterior acoustic behavior is quantified by comparing the gray-level values posterior to
the lesion to those in adjacent tissue at the same depth. This comparison is accomplished
by considering differences in the average gray-level values of the appropriate regions of
interest (ROIs). To avoid edge shadows, we define an ROI that is the lesion itself minus a
portion of its lateral sides. The left, post, and right ROIs are rectangular with the same
width and area as this reduced lesion ROI. These ROIs are shown schematically in Figure 3.
The posterior acoustic behavior feature considered is the minimum side difference (MSD).
To understand why the minimum is chosen, observe that the difference between the average
gray level posterior to the lesion and that in adjacent tissue at the same depth tends to be
negative for malignant lesions because of posterior acoustic shadowing. Choosing the minimum
thus errs on the side of malignancy. The posterior acoustic behavior feature is defined as

$$\mathrm{MSD} = \min\left(A_{\text{post}} - A_{\text{left}},\; A_{\text{post}} - A_{\text{right}}\right), \tag{6}$$

where $A_{\text{post}}$, $A_{\text{left}}$ and $A_{\text{right}}$ are the average gray-level values over the corresponding ROIs.
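
A minimal sketch of Eq. (6), assuming the gray levels of the three ROIs have already been collected into flat lists; the function and argument names are hypothetical.

```python
from statistics import mean


def minimum_side_difference(post_pixels, left_pixels, right_pixels):
    """MSD of Eq. (6): the smaller of the two post-minus-side differences."""
    a_post = mean(post_pixels)
    a_left = mean(left_pixels)
    a_right = mean(right_pixels)
    return min(a_post - a_left, a_post - a_right)
```

With a posterior mean of 40 and side means of 60 and 50, the differences are -20 and -10 and the MSD is -20, the more shadow-like of the two values.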

The above features are computed for each image in both the manually-segmented and
computer-segmented approaches. A particular feature value for a given lesion and
segmentation is the average of that feature over all the views available for the lesion, with
each lesion being represented by one to six images.

The computer-extracted features of shape, margin, texture and posterior acoustic behavior
are then merged through linear discriminant analysis (LDA) [12] for automatic
classification.
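
To make the merging step concrete, the sketch below fits a two-class Fisher linear discriminant in plain Python. It is an illustration under stated assumptions rather than the software used in the study: the function names are hypothetical, the discriminant direction is taken as the solution of $S_w w = \mu_{\text{malignant}} - \mu_{\text{benign}}$ with $S_w$ the pooled within-class scatter, and no prior or threshold is modeled.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter_matrix(rows, mu):
    """Sum of outer products (x - mu)(x - mu)^T over all rows."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for x in rows:
        diff = [xi - mi for xi, mi in zip(x, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("within-class scatter matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_lda(benign, malignant):
    """Return the Fisher discriminant weights for two classes."""
    mu0, mu1 = mean_vector(benign), mean_vector(malignant)
    s0, s1 = scatter_matrix(benign, mu0), scatter_matrix(malignant, mu1)
    sw = [[p + q for p, q in zip(r0, r1)] for r0, r1 in zip(s0, s1)]
    return solve_linear_system(sw, [m1 - m0 for m0, m1 in zip(mu0, mu1)])


def lda_score(w, x):
    """Project a feature vector onto the discriminant direction."""
    return sum(wi * xi for wi, xi in zip(w, x))
```

Each lesion's feature vector (for example DWR, NRG, COR and MSD) would be passed to `lda_score`, and the resulting scalar can then be analyzed with ROC methods; larger scores lean toward the malignant class.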

2.5  Evaluation 

In order to investigate the marginal difference of adding a feature to the LDA, combinations
of two and three features are merged in addition to merging all four features. In the case of
merging all four features, both consistency and round robin evaluations are performed in the
task of distinguishing malignant from benign lesions. In a consistency LDA [12], each case is
classified according to a classifier trained with all of the cases. In a round robin LDA [12],
one of the cases is removed from the data and that case is classified according to a classifier
trained with the remaining cases. This process is then repeated for each lesion. Since we are
using a linear classifier, we do not expect much difference between the results of the consistency
and round robin evaluations, and indeed, we found this to be the case.
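
The two evaluation schemes differ only in which cases the classifier sees during training, as the following sketch shows. The helper names are hypothetical, and the mean-difference projection used here is a deliberately simple stand-in for LDA so that the example stays self-contained.

```python
def consistency_scores(features, labels, fit, score):
    """Train once on all cases, then score every case with that classifier."""
    model = fit(features, labels)
    return [score(model, x) for x in features]


def round_robin_scores(features, labels, fit, score):
    """Leave one case out, train on the rest, score the held-out case."""
    scores = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        scores.append(score(model, features[i]))
    return scores


def fit_mean_difference(features, labels):
    """Stand-in linear classifier: direction from class-0 mean to class-1 mean."""
    pos = [x for x, y in zip(features, labels) if y == 1]
    neg = [x for x, y in zip(features, labels) if y == 0]
    mu1 = [sum(col) / len(pos) for col in zip(*pos)]
    mu0 = [sum(col) / len(neg) for col in zip(*neg)]
    return [a - b for a, b in zip(mu1, mu0)]


def project(weights, x):
    """Linear score of one feature vector."""
    return sum(w * xi for w, xi in zip(weights, x))
```

Calling `round_robin_scores(features, labels, fit_mean_difference, project)` returns one held-out score per case; with several hundred cases and a linear classifier, these scores are expected to differ only slightly from the consistency scores.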

It is of interest to compare how our CAD method performs in differentiating carcinomas
from the different types of benign lesions. In particular, we use the two-, three- and
four-feature LDA classifiers to differentiate malignant lesions from benign lesions for the following
data  subsets:  (A)  the  entire  database,  (B)  carcinomas  and  benign  solid  lesions  (all  benign 
cases  except  for  complex  cysts),  (C)  carcinomas  and  complex  cysts,  (D)  carcinomas  and 
benign  tumors  (fibroadenomas  and  papillomas),  (E)  carcinomas,  complex  cysts  and  benign 
tumors,  and  (F)  carcinomas  and  fibrocystic  disease.  Data  subset  (B)  is  important  as  there  is 
considerable  clinical  importance  in  the  differentiation  of  malignant  and  solid  benign  lesions. 
Data subset (E) is considered for two reasons. First, complex cysts and benign tumors
are the most represented benign lesion types in our database. Second,
complex cysts and benign tumors tend to have well-defined margins, and are thus more
easily differentiated from other types of lesions by our CAD method. Data subset (F) is
interesting  because  many  of  the  cases  of  fibrocystic  disease  are  difficult  for  radiologists  to 
see  on  our  images,  much  less  to  diagnose. 

For  the  two,  three  and  four-feature  classifiers,  the  LDA  was  trained  on  the  entire  database 
and  then  tested  on  the  data  subset  containing  only  the  particular  class  of  benign  lesions  of 
interest.  We  emphasize  that  for  a  given  set  of  features,  the  classifier  is  not  retrained  for  each 
data  subset,  but  rather  the  same  classifier  (trained  on  the  entire  database)  is  used  for  each  of 
the  different  data  subsets.  For  the  round  robin  evaluation  of  the  four-feature  classifier,  one 
of  the  cases  from  a  particular  data  subset  is  removed  and  that  case  is  classified  according  to 
the  classifier  trained  with  the  remaining  cases  from  the  entire  database.  This  process  is  then 
repeated for each case in the particular data subset.

Finally, independent validation was also performed on our CAD method. We randomly
selected  half  the  cysts,  half  the  benign  solid  lesions  and  half  the  malignant  lesions  to  be  our 
training  database  of  200  cases.  The  remaining  200  cases  formed  the  validation  database  (see 
Figure  4).  The  random  selection  was  performed  11  times.  For  each  of  the  11  randomizations, 
LDA  was  used  to  merge  the  four  computer-extracted  features  in  the  training  database,  and 
the  resulting  classifiers  were  evaluated  on  the  validation  database. 
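
The randomization described above can be sketched as a stratified half split. The function name, the label strings, and the use of a seeded `random.Random` generator are assumptions made for this illustration.

```python
import random


def stratified_half_split(labels, rng):
    """Randomly assign half of each class to training, the rest to validation.

    labels: list of class labels, one per case
    rng:    a random.Random instance (seeded by the caller for repeatability)
    Returns (train_indices, validation_indices).
    """
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, validation = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        half = len(members) // 2
        train.extend(members[:half])
        validation.extend(members[half:])
    return sorted(train), sorted(validation)
```

With 94 malignant lesions, 124 complex cysts and 182 benign solid lesions, each call yields 47 + 62 + 91 = 200 training cases and 200 validation cases; repeating the call 11 times with the same generator gives 11 different randomizations.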

Receiver operating characteristic (ROC) analysis [13] was used to evaluate (by case, not
by image) the performance of the individual computer-extracted features and the various
LDA classifiers in the task of distinguishing malignant lesions from various classes of benign
lesions. The area under the ROC curve, or $A_z$ value, and the partial area at 0.90 sensitivity,
or $_{0.90}A_z$ value [14], were used as indicators of merit.
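
For readers who want a quick check of an area under the ROC curve, the sketch below computes the nonparametric (Mann-Whitney) estimate from case-level scores. This is a stand-in and not the fitted binormal $A_z$ reported in this paper, so its values will generally differ somewhat from the fitted ones; the function name is hypothetical.

```python
def empirical_auc(benign_scores, malignant_scores):
    """Probability that a malignant case scores higher than a benign case.

    Ties count one half. Higher scores are assumed to indicate malignancy.
    """
    if not benign_scores or not malignant_scores:
        raise ValueError("both classes must contain at least one case")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))
```

The scores passed in should be per-case values (for example, features or classifier outputs averaged over the views of a lesion), matching the by-case evaluation described above.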

3  Results  and  Discussion 

Figures  5  and  6  show  scatter  plots  of  the  four  computer-extracted  features  derived  from  the 
automatically-defined  lesion  margins  for  the  entire  database  (400  cases).  As  anticipated, 
malignant  lesions  tend  to  demonstrate  a  larger  depth-to-width  ratio,  a  smaller  normalized 
radial  gradient  value  and  a  more  negative  minimum  side  difference  than  benign  lesions. 
The  autocorrelation  based  feature  demonstrates  more  overlap  in  the  values  of  benign  and 
malignant  cases. 

Combinations of two, three and four of the computer-extracted features were merged
with LDA using the entire database and the performance of the resulting classifiers tested
on each data subset using ROC analysis. The $A_z$ and partial $_{0.90}A_z$ values, as well as the
estimated standard deviations on the $A_z$ values (computed by LABROC4 [15]), for each of
the individual computer-extracted features are shown in Tables 2, 3, and 4 for data subsets A
and B, data subsets C and D, and data subsets E and F, respectively. Also shown are the $A_z$
and partial $_{0.90}A_z$ values for the combination of all four computer-extracted features, for both
the consistency and round robin evaluations. When tested on the entire database, the best
performing combination of two computer-extracted features was the depth-to-width ratio
and the normalized radial gradient. The best performing combination of three
computer-extracted features was the depth-to-width ratio, the normalized radial gradient and the
autocorrelation feature. In Tables 2, 3, and 4, we report only the performance of these
strongest two- and three-feature classifiers.

Considering the round robin evaluation of all four features, we see that the classifier has
the best performance when differentiating malignant lesions from complex cysts, with $A_z$
and $_{0.90}A_z$ values of 0.95 and 0.78, respectively, in the case of manual segmentation, and $A_z$
and $_{0.90}A_z$ values of 0.94 and 0.71, respectively, in the case of automatic segmentation. The
worst performance was demonstrated by the classifier when differentiating malignant lesions
from fibrocystic disease, with $A_z$ and $_{0.90}A_z$ values of 0.80 and 0.37, respectively, in the case of
manual segmentation, and $A_z$ and $_{0.90}A_z$ values of 0.70 and 0.19, respectively, in the case of
automatic segmentation. ROC curves of the round robin evaluations of the LDA using all
four features for data subsets A, B and C are shown in Figure 7.

It should be noted that for a given segmentation method and benign class, the
performances of the strongest two-feature, strongest three-feature and four-feature classifiers
are fairly similar. To determine whether the differences in performance are statistically
significant, univariate z-score tests of the differences in the $A_z$ and partial $A_z$ values were
performed using the program CLABROC [16]. Using a p-value less than 0.05 as the cutoff,
we failed to show a statistically significant difference between the performances of the
strongest three-feature and four-feature classifiers in differentiating carcinomas from any
class of benign lesions, using either manual or automatic segmentation. This indicates that
for our database, the posterior acoustic behavior feature, MSD, does not add significantly to
the performance of the four-feature LDA. The correlation coefficients between the individual
features were all less than 0.42, with the smallest correlation coefficient being -0.06 between
the depth-to-width ratio and the normalized radial gradient.




When comparing the performances of the strongest two-feature classifier to the strongest
three-feature classifier, a statistically significant difference in $A_z$ values was found in three
situations. The first is that of using lesion margins delineated through automatic
segmentation and testing the LDA on the entire database. Here the $A_z$ values of the strongest
two-feature and three-feature classifiers are 0.87 and 0.88, respectively, with a p-value of
0.03. The second is that of using lesion margins delineated through manual segmentation
and using the LDA classifiers to differentiate malignant lesions from complex cysts. Here
the $A_z$ values of the strongest two-feature and three-feature classifiers are 0.93 ± 0.02 and
0.95 ± 0.01, respectively, with a p-value of 0.001. The third is that of using lesion margins
delineated through manual segmentation and using the LDA classifiers to differentiate
malignant lesions from complex cysts and benign tumors. Here the $A_z$ values of the strongest
two-feature and three-feature classifiers are 0.93 ± 0.02 and 0.95 ± 0.01, respectively, with a
p-value of 0.02.

Independent validation was performed 11 times by splitting the entire database randomly
into two equal parts, as schematically shown in Figure 4. For each of the 11 independent
trials, LDA was used to determine two classifiers: one by merging the four computer-extracted
features derived from manually-defined lesion margins and the other by merging the four
computer-extracted features derived from the automatically-defined lesion margins. Then,
for each of the 11 independent trials, these two classifiers were tested on the validation
database, again using features derived from both manually-defined and automatically-defined
lesion margins. Table 5 lists the low, high and average $A_z$ and $_{0.90}A_z$ values resulting from
ROC analysis of each of the 11 independent trials in the task of distinguishing malignant
from benign lesions. Shown in Figure 8, for the case of using manually segmented lesions
for both training and validation, are the average $A_z$ values for the first $n$ trials, where
$n = 2, 3, \ldots, 11$. Also shown are the error bars, which are the $A_z$ values plus and minus one
standard deviation. As Figure 8 demonstrates, the $A_z$ values plateau after about 8
randomizations, indicating that 11 randomizations are sufficient. Shown in Figure 9 are the average
ROC curves for each training/validation pair. The average ROC curves were obtained by
averaging the a and b ROC curve parameters [17].
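
A minimal sketch of the averaging step, assuming the conventional binormal parameterization in which the ROC curve is $\mathrm{TPF} = \Phi(a + b\,\Phi^{-1}(\mathrm{FPF}))$ and the area is $A_z = \Phi(a/\sqrt{1+b^2})$, with $\Phi$ the standard normal cumulative distribution function. The function names are hypothetical, and the parameter pairs used to call them would come from the fitted curve of each trial.

```python
import math
from statistics import NormalDist, mean

STANDARD_NORMAL = NormalDist()


def average_binormal_parameters(params):
    """Average the (a, b) pairs of several fitted binormal ROC curves."""
    a_values = [a for a, _ in params]
    b_values = [b for _, b in params]
    return mean(a_values), mean(b_values)


def binormal_tpf(a, b, fpf):
    """True-positive fraction of a binormal ROC curve at 0 < fpf < 1."""
    return STANDARD_NORMAL.cdf(a + b * STANDARD_NORMAL.inv_cdf(fpf))


def binormal_az(a, b):
    """Area under a binormal ROC curve with parameters a and b."""
    return STANDARD_NORMAL.cdf(a / math.sqrt(1.0 + b * b))
```

Evaluating `binormal_tpf` with the averaged parameters over a grid of false-positive fractions traces one average curve per training/validation pair.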

4  Summary 

In summary, we have developed a CAD method for the classification of breast lesions on
ultrasound and performed three studies on a database of 400 cases. First, to investigate the
marginal benefit of adding a feature to our CAD method, LDA was used to merge
combinations of two, three and four of the computer-extracted features. In the task of distinguishing
malignant from benign lesions, the best two-feature classifier merges the depth-to-width and
normalized radial gradient features to yield an $A_z$ value of 0.90 using manual segmentation
and an $A_z$ value of 0.88 using automatic segmentation. At a p-value cutoff of 0.05, we fail
to show a statistically significant difference between the best two-feature classifier and the
four-feature classifier.

Second, the performance of our CAD method in distinguishing carcinomas from different
types of benign lesions was determined. Our CAD method yielded the best performance
in distinguishing carcinomas from complex cysts ($A_z$ = 0.95, round robin evaluation using
automatic segmentation) and the worst performance in distinguishing carcinomas from
fibrocystic disease ($A_z$ = 0.70, round robin evaluation using automatic segmentation). The
four-feature classifier using automatically-delineated lesion margins yielded a high performance
in the task of distinguishing carcinomas from complex cysts and benign tumors ($A_z$ = 0.92,
round robin evaluation).

Finally, 11 independent trials were performed on the entire database to obtain validation
results. Using computer-extracted features derived from automatically-delineated lesion
margins for both the training and validation, a mean $A_z$ of 0.87 ± 0.02 was obtained.

The results of this study warrant further investigation and in the future, an observer
study will be performed to evaluate the potential of our CAD method in improving physician
performance in the task of differentiating malignant from benign breast lesions on ultrasound.

Table 1: The number of each lesion type, as well as size information.


Lesion Type                                Number     Minimum   Maximum   Average
                                           of Cases   Size      Size      Size
Carcinomas                                 94         4 mm      37 mm     11 mm
Complex Cysts                              124        4 mm      28 mm     9 mm
Benign Solid Lesions                       182        3 mm      28 mm     10 mm
(All benign cases except complex cysts)
Benign Tumors                              107        4 mm      26 mm     10 mm
(Fibroadenomas and Papillomas)
Fibroadenomas                              100        4 mm      26 mm     10 mm
Papillomas                                 7          3 mm      19 mm     11 mm
Fibrocystic Disease                        65         3 mm      23 mm     10 mm
Inflammation                               2          7 mm      8 mm      7 mm
Infection                                  1          9 mm      9 mm      9 mm
No Abnormality                             3          5 mm      14 mm     11 mm
Radial Scar                                3          14 mm     28 mm     19 mm
Intramammary Lymph Node                    1          5 mm      5 mm      5 mm


Data Subset A: Malignant Lesions (94) vs. All Benign Lesions (306)
Data Subset B: Malignant Lesions (94) vs. Benign Solid Lesions (182)

                                Subset A, Manual     Subset A, Automatic  Subset B, Manual     Subset B, Automatic
Analysis                        Az (σ)       0.90Az  Az (σ)       0.90Az  Az (σ)       0.90Az  Az (σ)       0.90Az
DWR                             0.84 (0.02)  0.41    0.82 (0.02)  0.40    0.85 (0.02)  0.48    0.81 (0.03)  0.41
NRG                             0.76 (0.03)  0.27    0.75 (0.03)  0.34    0.70 (0.03)  0.19    0.68 (0.03)  0.21
COR                             0.81 (0.03)  0.42    0.70 (0.03)  0.22    0.75 (0.03)  0.30    0.62 (0.03)  0.15
MSD                             0.73 (0.02)  0.20    0.74 (0.03)  0.24    0.68 (0.03)  0.14    0.68 (0.03)  0.16
DWR and NRG                     0.90 (0.02)  0.55    0.88 (0.02)  0.55    0.88 (0.02)  0.52    0.83 (0.02)  0.45
DWR, NRG and COR                0.91 (0.02)  0.62    0.87 (0.02)  0.54    0.89 (0.02)  0.57    0.83 (0.02)  0.40
All four features               0.91 (0.02)  0.63    0.88 (0.02)  0.53    0.89 (0.02)  0.56    0.83 (0.02)  0.42
Round Robin: All four features  0.91 (0.02)  0.61    0.87 (0.02)  0.51    0.88 (0.02)  0.53    0.82 (0.02)  0.40

Table 2: Performance in terms of $A_z$ and $_{0.90}A_z$ values of the LDA for combinations of the
individual computer-extracted features for both manual and automatic segmentation. The
standard deviations on the $A_z$ values are given in parentheses. The LDA classifiers were
tested in differentiating malignant lesions from all benign lesions, and in differentiating
malignant lesions from benign solid lesions (all benign cases except the complex cysts).




Data  Subset  C 

Malignant  Lesions  (94)  vs. 

Complex  Cysts  (124) 

Data  Subset  D 

Malignant  Lesions  (94)  vs. 

Benign  Tumors  (107) 

Manual 

Segmentation 

Automatic 

Segmentation 

Manual 

Segmentation 

Automatic 

Segmentation 

Analysis 

A.  (ct) 

o.gAj 

Aj  (<r) 

o.gAj 

A,  (cr) 

o.gAj 

Az  (it) 

o.gAz 

DWR 

0.83  (0.03) 

0.33 

0.85  (0.03) 

0.29 

0.91  (0.02) 

0.59 

0.87  (0.02) 

0.46 

NRG 

0.86  (0.02) 

0.39 

0.85  (0.02) 

0.29 

0.75  (0.03) 

0.24 

0.73  (0.03) 

0.23 

COR 

0.91  (0.02) 

0.62 

0.81  (0.03) 

0.30 

0.78  (0.03) 

0.34 

0.63  (0.04) 

0.13 

MSD 

0.81  (0.03) 

_ 1 

0.30 

0.82  (0.03) 

0.36 

0.73  (0.03) 

0.19 

0.73  (0.03) 

0.13 

DWR  and  NRG 

0.56 

0.94  (0.02) 

0.63 

0.94  (0.01) 

0.66 

0.91  (0.02) 

0.51 

DWR,  NRG  and  COR 

0.95  (0.01) 

0.71 

0.93  (0.02) 

0.61 

0.94  (0.01) 

0.69 

0.91  (0.02) 

0.50 

All  four  features 

0.95  (0.01) 

0.72 

0.94  (0.01) 

0.64 

0.94  (0.01) 

0.69 

0.91  (0.02) 

0.47 

Round  Robin; 

All  four  features  | 

0.95  (0.01) 

0.70 

0.93  (0.02) 

0.61 

0.94  (0.01) 

0.66 

0.90  (0.02) 

0.44 

Table  3:  Performance  in  terms  of  and  0.9^2 values  of  the  LDA  for  combinations  of  the 
individual  computer-extracted  features  for  both  manual  and  automatic  segmentation.  The 
standard  deviations  on  the  values  are  given  in  parentheses.  The  LDA  classifier  were  tested 
in  differentiating  malignant  lesions  from  complex  cysts,  and  in  differentiating  malignant 
lesions  from  benign  tumors. 
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Data  Subset  R 

Malignant  Lesions  (94)  vs. 

Complex  Cysts  and 

Benign  Tumors  (231) 

Data  Subset  F 

Malignant  Lesions  (94)  vs. 

Fibrocystic  Disease  (65) 

Manual 

Segmentation 

Automatic 

Segmentation 

Manual 

Segmentation 

Automatic 

Segmentation 

Analysis 

A,  (<7) 

Az  (<t) 

0.9A2 

A* 

o.qAj 

A.  (a) 

0.9A2 

DWR 

0.86  (0.02) 

0.44 

0.86  (0.02) 

0.38 

0.78  (0.04) 

0.33 

0.72  (0.04) 

0.28 

NRG 

0.81  (0.03) 

0.33 

0.79  (0.03) 

0.39 

0.63  (0.04) 

0.13 

0.62  (0.04) 

0.16 

COR 

0.85  (0.02) 

0.49 

0.73  (0.03) 

0.23 

0.72  (0.04) 

0.27 

0.61  (0.04) 

0.17 

MSD 

0.77  (0.03) 

0.25 

0.78  (0.03) 

0.28 

0.61  (0.04) 

0.08 

0.60  (0.05) 

0.08 

DWR  and  NRG 

0.93  (0.01) 

0.61 

0.93  (0.02) 

0.59 

0.80  (0.04) 

0.32 

0.74  (0.04) 

0.29 

DWR,  NRG  and  COR 

0.95  (0.01) 

0.70 

0.92  (0.02) 

0.57 

0.81  (0.03) 

0.38 

0.74  (0.04) 

0.28 

All  four  features 

0.95  (0.01) 

0.70 

0.93  (0.02) 

0.56 

0.81  (0.03) 

0.38 

0.73  (0.04) 

0.26 

Round  Robin: 

All  four  features 

0.94  (0.01) 

0.69 

0.92  (0.02) 

0.39 

0.80  (0.04) 

0.36 

0.72  (0.04) 

0.25 

Table 4: Performance in terms of Az and 0.90Az values of the LDA for combinations of the
individual computer-extracted features for both manual and automatic segmentation. The
standard deviations on the Az values are given in parentheses. The LDA classifiers were
tested in differentiating malignant lesions from complex cysts and benign tumors, and in
differentiating malignant lesions from fibrocystic disease.
Segmentation    Segmentation    Low                 High                Average
Used for        Used for
Training        Validation      Az (σ)    0.90Az    Az (σ)    0.90Az    Az (σ)         0.90Az
Manual          Manual          0.87      0.50      0.94      0.72      0.91 ± 0.02    0.62 ± 0.07
Manual          Automatic       0.85      0.51      0.90      0.70      0.87 ± 0.02    0.60 ± 0.05
Automatic       Manual          0.82      0.28      0.93      0.62      0.88 ± 0.04    0.47 ± 0.15
Automatic       Automatic       0.82      0.36      0.92      0.70      0.87 ± 0.02    0.52 ± 0.11

Table 5: Low, high and average Az and 0.90Az values of the LDA for the 11 independent trials.


Figure  1:  Examples  of  (a)  a  malignant  lesion,  (b)  a  complex  cyst,  (c)  fibrocystic  disease  and 
(d)  a  fibroadenoma.  The  manually-delineated  margin  is  given  in  gray  and  the  computer- 
delineated  margin  in  white. 
Figure 2: The ROI used to define the autocorrelation feature. The lesion is outlined with a
solid line and the ROI is outlined with a dashed line.

Figure 3: ROIs used to define the posterior acoustic behavior feature.
Figure 4: Random selection splitting the entire database in two: the training and validation
databases. (Panel label: Training Database, 200 cases.)


Figure 5: The scatter plots indicate values for the depth-to-width and normalized radial
gradient features for the entire database. Margins defined via automatic segmentation were
used. (Axis label: Normalized Radial Gradient.)


Figure 6: The scatter plots indicate values for the autocorrelation-based feature and the
minimum side difference for the entire database. Margins defined via automatic segmentation
were used. (Axis labels: Autocorrelation Feature; Maximum Side Difference.)
Figure 7: ROC curves of round robin evaluations for data subset A (the entire database),
data subset B (carcinomas and benign solid lesions) and data subset C (carcinomas and
complex cysts). Margins defined via automatic segmentation were used. (Axis label:
False-Positive Fraction.)


Figure 9: Average ROC curves for the 11 independent trials. (Axis label: True-Positive
Fraction.)


Automatic Segmentation of Breast Lesions on Ultrasound

Abstract

This  paper  presents  a  simple  and  computationally-efficient  segmentation  algorithm  for 
breast  masses  on  sonography  that  is  based  on  maximizing  a  utility  function  over  partition 
margins  defined  through  gray-value  thresholding  of  a  preprocessed  image.  The  performance 
of  the  segmentation  algorithm  is  evaluated  on  a  database  of  400  cases  in  two  ways.  Of  the  400 
cases,  124  were  complex  cysts,  182  were  benign  solid  lesions  and  94  were  malignant  lesions.  In 
the  first  evaluation,  the  computer-delineated  margins  were  compared  to  manually-delineated 
margins.  At  an  overlap  threshold  of  0.40,  the  segmentation  algorithm  correctly  delineated 
94%  of  the  lesions.  In  the  second  evaluation,  the  performance  of  our  computer-aided  diagnosis 
method  on  the  computer-delineated  margins  was  compared  to  the  performance  of  our  method 
on the manually-delineated margins. Round robin evaluation yielded Az values of 0.90 and
0.87 on the manually-delineated margins and the computer-delineated margins, respectively,
in the task of distinguishing between malignant and non-malignant lesions.
1 Introduction

Ultrasound is currently used to diagnose simple cysts of the breast with a reported accuracy
of 96-100% [1]. However, due to the large overlap in the sonographic appearance of malignant
and benign solid lesions, most radiologists feel uncomfortable relying on ultrasound to
differentiate solid masses. This results in the utilization of biopsy procedures for most solid
breast lesions interpreted with sonography. In a recent study, Stavros et al. [2] developed
a classification scheme which used various sonographic features identified by radiologists to
achieve a sensitivity of 98.4% and a negative predictive value of 99.5% on a data set of 750
solid breast masses. The use of specific sonographic features holds the potential for accurate
classification of solid breast masses using ultrasound, thereby allowing a decrease in the
number of biopsies performed for benign solid lesions. The identification of sonographic
features can also potentially be automated.

Computer-aided diagnosis (CAD) methods on breast ultrasound are being explored by
various researchers [3, 4, 5, 6, 7]. Lesion segmentation is often an important step in
computer-aided diagnosis schemes. In this paper, we propose a simple and computationally
efficient segmentation algorithm for breast sonography that is based on maximizing a utility
function over partition margins defined through gray-value thresholding of a preprocessed
image. The key step in the image processing involves multiplication by a constraint function
whose level surfaces are ellipses. When gray-value thresholding is applied to an image so
processed, the result is potential lesion margins (partition margins) that are deformations of
ellipses, or “lesion-like”. A gradient-based utility function is then used to choose the lesion
margin from the potential margins. The performance of this algorithm is evaluated on a large
database in two ways: (1) by comparing computer-delineated margins to manually-delineated
margins, and (2) by comparing the performance of our CAD scheme on the computer-delineated
margins to the performance of our CAD scheme on the manually-delineated margins.
2 Material and Methods

2.1  Database 

Our database consists of 400 consecutive ultrasound cases, represented by 757 images.
These images were acquired during diagnostic breast exams at the Lynn Sage Breast Center
of Northwestern Memorial Hospital. The cases were collected retrospectively and all had
been either biopsied or aspirated. Of the 400 cases, 124 were complex cysts, 182 were benign
solid lesions and 94 were malignant solid lesions. Note that the database does not contain
any simple cysts. The images were obtained with an ATL 3000 unit and were captured
directly from the 8-bit video signal. The number of images per case varied from one to six.
Size information for each of the lesion types is given in Table 1.

Table 1: Size information for complex cysts, benign solid lesions and malignant lesions.

Lesion Type             Minimum Size    Maximum Size    Average Size
Complex Cysts           4 mm            28 mm           9 mm
Benign Solid Lesions    3 mm            28 mm           10 mm
Malignant Lesions       4 mm            37 mm           11 mm

2.2  Lesion  Segmentation 

In  each  image,  lesions  were  both  manually  and  automatically  segmented  from  normal  breast 
tissue. 

Manual segmentation involved displaying each ultrasound image on an IBM PowerDisplay20
monitor (Armonk, NY) and having a mammographer or medical physicist outline
the lesion margin using software designed for that purpose. The geometric centers of these
manually outlined lesions are then used as input to the automatic segmentation algorithm.
The automatic lesion segmentation algorithm involves (1) preprocessing by cropping and
median filtering, (2) multiplication with a Gaussian constraint function, (3) determination of
potential lesion margins through gray-value thresholding, and (4) maximization of a utility
function on the potential lesion margins.

The  segmentation  method  begins  with  preprocessing  the  image.  The  subcutaneous  fat 
is  removed  by  cropping  the  top  of  the  image  by  35  pixels.  We  found  that  for  this  database, 
cropping  in  this  manner  was  sufficient  for  our  purpose.  In  the  future,  instead  of  cropping 
each  image  by  a  fixed  number  of  pixels,  the  edge  of  the  subcutaneous  fat  could  be  detected 
in  each  image  and  used  to  estimate  the  appropriate  number  of  pixels  to  crop.  After  removing 
the  subcutaneous  fat,  a  10  by  10  median  filter  is  used  to  suppress  the  ultrasound  speckle. 
An example of a preprocessed image is shown in Figure 1b.
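The preprocessing step can be sketched as follows. This is a minimal illustration in plain Python and not the implementation used in the study: the function name `preprocess`, the list-of-rows image representation, and the policy of clipping the filter window at the image border are all assumptions.

```python
from statistics import median


def preprocess(image, crop_rows=35, size=10):
    """Crop the top rows, then apply a size-by-size median filter.

    image: list of equal-length rows of gray values.
    The window is clipped at the border (an assumed border policy).
    """
    cropped = image[crop_rows:]
    rows, cols = len(cropped), len(cropped[0])
    lo, hi = size // 2, size - size // 2  # window covers [r - lo, r + hi)
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [cropped[rr][cc]
                      for rr in range(max(0, r - lo), min(rows, r + hi))
                      for cc in range(max(0, c - lo), min(cols, c + hi))]
            # For an even number of samples, median() averages the two
            # middle values.
            out_row.append(median(window))
        out.append(out_row)
    return out
```

With the default arguments, the first 35 rows are discarded and isolated bright or dark pixels smaller than the window are replaced by the local median.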

The  next  step  involves  multiplying  by  a  constraint  function  centered  on  the  lesion  center. 
Kupinski  and  Giger  used  this  method  for  lesions  in  mammography  with  the  effect  of  suppress¬ 
ing  distant  pixel  values  and  encouraging  potential  lesion  margins  to  be  more  “lesion-like” 
[8].  A  similar  technique  may  be  applied  to  ultrasound  images  by  inverting  the  gray-scale  of 
the  preprocessed  image  before  multiplying  by  a  constraint  function.  If  C  is  the  constraint 
function,  then  the  resulting  image  is 


$$J(P) = C(P) \left( 1 - \frac{I(P)}{\max_P I(P)} \right), \tag{1}$$


where P is the pixel location. Inverting the image changes the lesion from dark (low gray
values) to light (high gray values). The constraint function should have higher gray values
in the region of the lesion and gray values near zero far from the lesion. An example of an
inverted image is shown in Figure 1c. In this study, a Gaussian was used as the constraint
function. The Gaussian is centered at the manually defined lesion center, $\mu$:


$$C(P) = N(P; \mu, K) = \frac{\exp\left( -\frac{1}{2} (P - \mu)^T K^{-1} (P - \mu) \right)}{2\pi \sqrt{\det K}}. \tag{2}$$


Horsch,  Submitted  to  Medical  Physics. 




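Here the covariance matrix is assumed diagonal,

$$K = \begin{pmatrix} \sigma_x^2 & 0 \\ 0 & \sigma_y^2 \end{pmatrix}, \tag{3}$$

where $\sigma_x^2$ and $\sigma_y^2$ are the variances in the lateral and depth directions, respectively. These
variances are chosen as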

$$\sigma_x^2 = \frac{w}{2}, \qquad \sigma_y^2 = \frac{h}{2}, \tag{4}$$

with w being the estimated lesion width and h being the estimated lesion height (or depth).
An example of the preprocessed image multiplied by a Gaussian constraint function is shown
in Figure 1d.
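The inversion and Gaussian weighting described above can be sketched in a few lines. This is a hedged illustration only: the function name `constrained_image` is hypothetical, the image is a plain list of rows, and the variances are assumed to be half the estimated width and height, which is one reading of the garbled variance equation in the source.

```python
import math


def constrained_image(image, center, width, height):
    """Return J(P) = C(P) * (1 - I(P) / max I) for a Gaussian C.

    image:  list of rows of gray values (the preprocessed image)
    center: (mu_x, mu_y) lesion center as (column, row)
    width, height: positive lesion size estimates in pixels
    """
    mu_x, mu_y = center
    var_x, var_y = width / 2.0, height / 2.0  # assumed variance choice
    i_max = max(max(row) for row in image)
    if i_max == 0:
        raise ValueError("image must contain a nonzero gray value")
    # 2 * pi * sqrt(det K) for a diagonal covariance matrix K
    norm = 2.0 * math.pi * math.sqrt(var_x * var_y)
    out = []
    for y, row in enumerate(image):
        out_row = []
        for x, value in enumerate(row):
            expo = -0.5 * ((x - mu_x) ** 2 / var_x + (y - mu_y) ** 2 / var_y)
            out_row.append(math.exp(expo) / norm * (1.0 - value / i_max))
        out.append(out_row)
    return out
```

Pixels at the maximum gray value map to zero, while dark pixels near the center receive the largest weights, so thresholding the result yields nested, roughly elliptical candidate regions.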

In order to study the sensitivity of the segmentation algorithm to the choice of variance,
both manual and automatic width and height estimation were performed. In this paper,
the segmentation algorithms using manually and automatically-estimated lesion width and
height are referred to as partially-automatic and fully-automatic, respectively.

In the partially-automatic segmentation, manual estimation of the lesion width and height
is achieved using the manually-delineated lesion margin. If $\gamma(i) = (\gamma_1(i), \gamma_2(i))$ is a discrete
parameterization of the manually-delineated margin with $\gamma_1$ and $\gamma_2$ being the coordinates in
the lateral and depth directions, respectively, then we define

$$w_{manual} = \max_i \gamma_1(i) - \min_i \gamma_1(i), \tag{5}$$

$$h_{manual} = \max_i \gamma_2(i) - \min_i \gamma_2(i). \tag{6}$$

In the fully-automatic lesion segmentation, estimates of the lesion width and height are
determined through Sobel edge detection. The Sobel filtered images are defined by

$$I_x = I * F_x, \qquad I_y = I * F_y,$$

where I is the preprocessed image, * is the convolution operator, and $F_x$ and $F_y$ are 3 by 3
Sobel filters in the lateral and depth directions, respectively.

Figure 1: The results of the segmentation processing steps on an example image: (a) the
original image, (b) the preprocessed image (cropped and median filtered), (c) the inverted
preprocessed image, (d) the inverted preprocessed image multiplied by a Gaussian, (e) the
partitions resulting from gray-value thresholding, and (f) the ARD as a function of partition
number. In (f), the smaller the partition number, the smaller the area enclosed by the
partition. For this particular example, the computer-chosen partition is number 35 and is
shown as a dashed line on the image in (e).
Estimates of the location of the lesion edge along horizontal and vertical lines through the
lesion center $\mu = (\mu_x, \mu_y)$ are given by

$$x_0 = \arg\min_i I_x(i, \mu_y), \qquad x_1 = \arg\max_i I_x(i, \mu_y),$$

$$y_0 = \arg\min_j I_y(\mu_x, j), \qquad y_1 = \arg\max_j I_y(\mu_x, j),$$

where $I_x$ and $I_y$ denote the Sobel filtered images in the lateral and depth directions.
An example of these locations is shown in Figure 2. The estimated locations of the lesion edges
are then used to estimate the lesion width and height by

$$w_{automatic} = 2 \min(\mu_x - x_0, \; x_1 - \mu_x), \tag{8}$$

$$h_{automatic} = 2 \min(\mu_y - y_0, \; y_1 - \mu_y). \tag{9}$$

Figure 2: Automatic estimations of edge points. The manually-delineated lesion is shown
for reference.

Note that for the width, instead of using the length between the left and right edges, we use
twice the minimum of the lengths between the lesion center and the left and right edges.
This is done to avoid the overestimation which may result when distant pixels are mistaken
for the lesion edge. Similar comments apply to the automatic lesion height estimation. The
lesion segmentation which results from using such estimations will err on the side of “under
growing” rather than “over growing”.
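The width estimate above can be illustrated with a short sketch. It assumes a derivative-oriented Sobel response (positive where gray values rise to the right), a search over the full image row through the center, and hypothetical function names; the sign convention of the filters and any search restrictions used in the study are not recoverable from the text.

```python
def sobel_x(image, x, y):
    """Horizontal 3 by 3 Sobel response at interior pixel (x, y).

    Positive where gray values increase to the right (assumed sign).
    """
    return sum(w * (image[y + dy][x + 1] - image[y + dy][x - 1])
               for dy, w in ((-1, 1), (0, 2), (1, 1)))


def estimate_width(image, center):
    """Twice the smaller distance from the center to the left/right edge.

    center: (mu_x, mu_y); mu_y must not lie on the top or bottom row.
    """
    mu_x, mu_y = center
    cols = range(1, len(image[0]) - 1)  # skip the border columns
    response = {x: sobel_x(image, x, mu_y) for x in cols}
    x0 = min(response, key=response.get)  # strongest bright-to-dark step
    x1 = max(response, key=response.get)  # strongest dark-to-bright step
    return 2 * min(mu_x - x0, x1 - mu_x)
```

The height estimate follows the same pattern with a vertical Sobel response evaluated along the column through the center.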

When $w_{automatic}$ and $h_{automatic}$ are used in equation (2), the lesion center is the only
information defined manually that is needed by the segmentation algorithm.

We  emphasize  that  the  variances  in  the  width  and  depth  directions  for  the  Gaussian 
constraint  function  are  varied  adaptively  and  automatically  for  each  image.  This  differs 
from  the  study  done  by  Kupinski  and  Giger,  in  which  a  single  variance  is  used. 

After  applying  the  Gaussian  constraint  function  to  the  inverted  preprocessed  image, 
gray-value  thresholding  defines  partitions  whose  margins  are  potential  lesion  margins.  The 
potential  margin  that  maximizes  the  utility  function  on  the  preprocessed  image  then  defines 
the  lesion  margin.  The  utility  function  used  in  our  segmentation  algorithm  is  the  Average 
Radial  Derivative  (ARD),  which  gives  the  average  directional  derivative  in  the  radial  direction 
along  the  margin, 

$$\mathrm{ARD}(\Gamma) = \frac{1}{N} \sum_{P \in \Gamma} \nabla I(P) \cdot \hat{r}(P), \tag{10}$$

where $\Gamma$ is the discretized potential lesion margin, N is the number of points in $\Gamma$, $\hat{r}(P)$
is the unit vector in the radial direction from the geometric center of the partition to the
point P = (x, y), and $\cdot$ is the dot product between vectors. An example of potential lesion
margins resulting from gray-value thresholding and an example of the ARD as a function
of partition number are shown in Figures 1e and 1f. Note that this utility function differs
from  that  used  by  Kupinski  and  Giger  for  mammographic  lesions.  Their  technique,  based  on 
a  utility  function  called  the  Normalized  Radial  Gradient,  evaluates  the  average  orientation 
of  the  gray  level  gradients  along  the  margin  [8].  The  Normalized  Radial  Gradient  is  used 
elsewhere  in  this  paper  (see  Equation  12). 
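A minimal sketch of the ARD and of the margin selection it drives is given below. The names `average_radial_derivative` and `choose_margin` are hypothetical, the gradient is approximated by central differences, and the geometric center is approximated by the mean of the margin points; these are assumptions rather than details confirmed by the text.

```python
import math


def gradient(image, x, y):
    """Central-difference gradient (dI/dx, dI/dy) at an interior pixel."""
    gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
    gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return gx, gy


def average_radial_derivative(image, margin):
    """Mean of grad I(P) . r_hat(P) over the margin points (x, y)."""
    cx = sum(x for x, _ in margin) / len(margin)
    cy = sum(y for _, y in margin) / len(margin)
    total = 0.0
    for x, y in margin:
        rx, ry = x - cx, y - cy
        norm = math.hypot(rx, ry)
        if norm == 0:
            continue  # a point at the center has no radial direction
        gx, gy = gradient(image, x, y)
        total += (gx * rx + gy * ry) / norm
    return total / len(margin)


def choose_margin(image, candidates):
    """Pick the candidate margin with the largest ARD."""
    return max(candidates, key=lambda m: average_radial_derivative(image, m))
```

Because the ARD is not normalized by the gradient magnitude, it favors margins that sit on strong dark-to-bright transitions, which is what distinguishes it from the Normalized Radial Gradient.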

Manual,  partially-automatic  and  fully-automatic  segmentation  were  performed  on  each 
ultrasound  image  in  the  database.  Examples  of  each  type  of  segmentation  are  shown  in 
Figure 3.

We observe that on average, both the partially-automatic and the fully-automatic
segmentation algorithms tend to result in smaller lesions than those defined manually. One
reason for this may be that radiologists seem to “overdraw” lesion margins. This remark
is based on observing many radiologists outline lesions. In addition, the lesions segmented
by the fully-automatic algorithm tend to be smaller than those segmented by the
partially-automatic algorithm. This is in part because the lesion height and width estimations
for fully-automatic segmentation tend to be less than those for partially-automatic
segmentation. The fully-automatic width estimation is twice the minimum of the lengths from
the lesion center to the left and right lesion edges (see Equation 8). The partially-automatic
width estimation is the maximum horizontal length in the manually outlined margin (see
Equation 5). Similar definitions apply to the height estimations.

2.3  Performance  Evaluation 

The performance of the segmentation algorithm can be assessed by comparing the
computer-delineated outlines against the outlines drawn by human observers. For a particular
lesion, the overlap, O, between the computer-segmentation and the manual-segmentation is
given by

$$O = \frac{\mathrm{Area}(M \cap C)}{\mathrm{Area}(M \cup C)}, \tag{11}$$

where M is the set of points in the manually-segmented lesion and C the set of points in the
computer-segmented lesion (either partially or fully-automatic). The overlap ranges between
zero and one, being zero in the case of no overlap and one in the case of exact overlap. To
study the overlap for the entire database, overlap thresholds are set. At each threshold, the
number of lesions “correctly” segmented is given by the number of lesions with O greater
than the threshold.
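The overlap measure and the thresholding step can be sketched directly with sets of pixel coordinates. The function names are hypothetical, and representing each segmented lesion as a collection of (x, y) pixels is an assumption made for illustration.

```python
def overlap(manual_pixels, computer_pixels):
    """Intersection area over union area of two pixel sets."""
    m, c = set(manual_pixels), set(computer_pixels)
    union = m | c
    return len(m & c) / len(union) if union else 0.0


def fraction_correct(overlaps, threshold=0.4):
    """Fraction of lesions whose overlap exceeds the threshold."""
    return sum(o > threshold for o in overlaps) / len(overlaps)
```

For example, two 4 by 4 pixel squares shifted by one column share 12 of 20 pixels, giving an overlap of 0.6, which counts as "correctly" segmented at a threshold of 0.4.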

Ultimately,  we  are  concerned  with  computer-aided  diagnosis.  Therefore,  the  performance 
Figure 3: Examples of both the partially and fully-automatic segmentation results (panels:
Image, Partially-Automatic, Fully-Automatic). The manually-delineated margin is given in
gray and the computer-delineated margin in white. A complex cyst (a), a benign solid (b)
and a malignant solid (c).
of the segmentation algorithm can be measured through the performance of our automatic
classifier.  The  classifier  uses  linear  discriminant  analysis  (LDA)  to  merge  four  computer- 
extracted  features.  The  classifier  and  the  extracted  features  are  described  in  the  next  section 
and  the  appendix.  Receiver  operating  characteristic  (ROC)  analysis  [9]  was  used  to  evaluate 
(by  case,  not  by  image)  the  performance  of  the  individual  computer-extracted  features  and 
the  LDA  classifier  in  the  task  of  distinguishing  benign  from  malignant  lesions.  The  Az  [9] 
and  partial  Az  values  [10]  are  used  as  indicators  of  merit.  The  Az  value  is  the  area  under 
the  ROC  curve  and  the  partial  Az  value  is  the  area  under  the  ROC  curve  but  above  the  0.90 
sensitivity  line  [9],  [10]. 
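As a rough illustration of the area under the ROC curve, the sketch below computes the nonparametric (Mann-Whitney) estimate from two lists of classifier outputs. This is a stand-in and not the fitted ROC analysis referenced above, so its values will generally differ slightly from Az; the function name is hypothetical and the partial area index is not covered.

```python
def empirical_auc(malignant_scores, benign_scores):
    """Probability that a malignant case scores higher than a benign one.

    Ties count as one half. Both lists must be non-empty.
    """
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))
```

A value of 0.5 corresponds to chance performance and 1.0 to perfect separation of malignant and benign cases.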

2.4  Automatic  Feature  Extraction 

Features  that  radiologists  use  clinically  in  the  evaluation  of  breast  masses  on  sonograms 
include  margin  definition,  echogenic  texture,  posterior  acoustic  enhancement  or  shadowing 
and  lesion  shape  [11].  Benign  lesions  tend  to  demonstrate  well-defined,  smooth  margins, 
posterior  acoustic  enhancement,  and  a  lesion  shape  that  is  wider  rather  than  taller.  Benign 
solid  lesions  can  be  hypoechoic  or  hyperechoic.  Malignant  lesions,  on  the  other  hand,  tend 
to  demonstrate  ill-defined,  angular  and  irregular  margins,  marked  hypoechogenicity,  and 
posterior  acoustic  shadowing. 

Four  characteristics  were  studied  here:  margin,  echogenicity,  posterior  acoustic  behavior 
and  shape.  These  were  automatically  quantified  using  the  normalized  radial  gradient  [12, 13], 
the  autocorrelation,  a  comparison  of  gray  levels  and  the  depth-to-width  ratio.  These  are 
briefly described in the appendix and discussed in detail elsewhere [14].

The computer-extracted features were computed for each image and for each of the
segmentation methods described earlier: manual, partially-automatic and fully-automatic. A
particular feature value for a given lesion (case) and segmentation was taken to be the
average of that feature over all the views available for the lesion, each lesion being
represented by one to six images. Linear discriminant analysis (LDA) [15] was used to merge the
computer-extracted margin, echogenicity, posterior acoustic behavior and shape features.
Both consistency and round-robin runs were performed. In a consistency LDA [15], each
lesion is classified according to a classifier trained with all of the lesions. In a round robin
LDA [15], one of the lesions is removed from the data and that lesion is classified according
to a classifier trained with the remaining lesions. This process is then repeated for each
lesion.
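The round-robin procedure can be sketched as a leave-one-out loop around a linear discriminant. To keep the example short it uses only two features and a closed-form 2 by 2 inverse, whereas the study merges four features; the function names and the use of Fisher's discriminant with pooled within-class scatter are assumptions for illustration.

```python
def fit_lda(samples, labels):
    """Two-feature Fisher discriminant; returns the weight vector w."""
    def mean(rows):
        return [sum(r[k] for r in rows) / len(rows) for k in range(2)]

    pos = [s for s, l in zip(samples, labels) if l == 1]
    neg = [s for s, l in zip(samples, labels) if l == 0]
    m1, m0 = mean(pos), mean(neg)
    # Pooled within-class scatter matrix (2 by 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((pos, m1), (neg, m0)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = S^-1 (m1 - m0), using the closed-form inverse of a 2 by 2 matrix.
    return [(s[1][1] * dm[0] - s[0][1] * dm[1]) / det,
            (-s[1][0] * dm[0] + s[0][0] * dm[1]) / det]


def round_robin_scores(samples, labels):
    """Score each case with a classifier trained on all other cases."""
    scores = []
    for i, x in enumerate(samples):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        w = fit_lda(train_x, train_y)
        scores.append(w[0] * x[0] + w[1] * x[1])
    return scores
```

A consistency run corresponds to calling `fit_lda` once on all cases and scoring those same cases, which tends to give a more optimistic estimate than the round-robin scores.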
Figure 4: The overlap results for the partially and fully-automatic segmentation on the entire
database (757 images).


3  Results  and  Discussion 


3.1  Segmentation  Overlap 


The overlap results for the entire database are shown in Figure 4. The overlap of the
lesions defined through partially-automatic segmentation was slightly greater than the overlap
of those defined through fully-automatic segmentation. This is not surprising as the
partially-automatic segmentation algorithm uses height and width estimations derived from
the manual margin outlines and so their overlap with the manually segmented lesions should
be greater. At an overlap threshold of 0.4, the fraction of images “correctly” segmented is
0.97 for the partially-automatic method, and 0.94 for the fully-automatic method. The mean
overlap for the partially-automatic segmentation method is 0.77 and for the fully-automatic
segmentation method is 0.73. A paired t-test for the overlap measures yields a p-value of
less than 0.0001.

Figure 5: The overlap results on 113 images for the segmentation methods in addition to
the overlap results of another radiologist’s outlines and a medical physicist’s outlines.

In  order  to  study  variability  in  margin  definition  between  different  human  observers, 
113  images  representing  55  cases  were  outlined  also  by  a  second  radiologist  and  a  medical 
physicist.  Figure  5  shows  the  overlap  results  for  these  113  images.  Table  2  gives  the  p- 
values  resulting  from  paired  t-tests  for  means  on  the  overlap  measures  from  the  various 
segmentation  methods.  For  example,  the  null  hypothesis  of  the  first  row  is  that  the  overlap 
Table 2: The p-values resulting from paired t-tests for means on the overlap measures from
the various segmentation methods.

Segmentation Method    Mean    Segmentation Method    Mean    p-value
Second Radiologist     0.74    Medical Physicist      0.76    0.18
Partially-Automatic    0.71    Fully-Automatic        0.68    0.014
Partially-Automatic    0.71    Second Radiologist     0.74    0.18
Partially-Automatic    0.71    Medical Physicist      0.76    0.016
Fully-Automatic        0.68    Second Radiologist     0.74    0.0041
Fully-Automatic        0.68    Medical Physicist      0.76    0.0003

sample  means  of  both  data  sets  (the  overlap  for  the  113  images  from  the  second  radiologist 
and  those  from  the  medical  physicist)  are  equal.  A  p-value  less  than  0.05  is  commonly 
used as the cutoff indicating a statistically significant difference. The overlaps of the medical
physicist and the second radiologist are similar (p-value > 0.05, indicating a failure to show a
statistically significant difference), suggesting that the variability in margin definition between
two radiologists is similar to the variability between a radiologist and a medical physicist.
This  provides  some  justification  for  using  the  medical  physicist’s  outlines  for  part  of  the 
database.  In  general,  the  overlap  of  the  lesions  defined  by  another  human  observer  is  similar 
to  the  overlap  of  the  lesions  defined  by  partially-automatic  segmentation  and  slightly  better 
than  the  overlap  of  lesions  defined  by  fully-automatic  segmentation. 


3.2  Segmentation  and  Computer-Aided  Diagnosis 

It  is  important  to  consider  how  changes  in  segmentation  affect  the  performance  of  individual 
computer-extracted  features  in  the  task  of  differentiating  benign  and  malignant  lesions.  The 
Az  values  for  the  various  computer-extracted  features  and  for  the  LDA  consistency  and  round 
robin runs are given in Table 3. Table 4 shows the p-values associated with univariate
z-score tests of the differences in the Az and partial Az values for each individual feature,
as well as for the LDA consistency and round robin runs. The null hypothesis assumes that
the data in different columns arose from binormal ROC curves having equal Az values.

For the individual computer-extracted features, the Az values ranged from 0.70 to 0.86 in
distinguishing benign from malignant lesions. The depth-to-width ratio (DWR) performed
best in this study, with an Az of 0.84 for the manual segmentation, 0.86 for the
partially-automatic segmentation, and 0.82 for the fully-automatic segmentation. At a significance
level of 0.05, we see a statistically significant difference in the Az value of the depth-to-width
feature when changing the segmentation from manual to partially-automatic (from
Az = 0.84 to Az = 0.86) or from partially-automatic to fully-automatic (from Az = 0.86 to
Az = 0.82). The autocorrelation feature performs well in the manual segmentation case, with an
Az of 0.81. The performance in the partially and fully-automatic cases is not as high, with
Az values of 0.74 and 0.70, respectively. This difference in performance of the autocorrelation
feature when changing the segmentation from manual to either partially-automatic or
fully-automatic is statistically significant. Both the normalized radial gradient and the maximum
side difference perform similarly with all three segmentation methods. This indicates that,
for our database, these features are robust to small changes in segmentation.

Table 3: Performance in terms of Az and 0.90Az values of individual computer-extracted
features as well as the LDA for manual, partially-automatic and fully-automatic segmentation.

                              Manual            Partially-Automatic   Fully-Automatic
Analysis                      Az      0.90Az    Az      0.90Az        Az      0.90Az
Maximum Side Difference       0.73    0.20      0.73    0.23          0.74    0.24
LDA Consistency               0.91    0.63      0.90    0.61          0.88    0.53
LDA Round Robin               0.91    0.61      0.89    0.58          0.87    0.51

Rows with entries lost in extraction (surviving values in source order; column positions uncertain):
Depth-to-Width Ratio          0.41
Autocorrelation Based         0.42, 0.22
Normalized Radial Gradient    0.76, 0.27, 0.76, 0.75, 0.34

For  the  LDA  consistency  and  round  robin  runs,  the  change  in  the  Az  and  partial  Az 
are  statistically  significant  when  changing  from  either  manual  segmentation  or  partially- 
Table 4: P-values for Table 3.

                              Manual vs.             Manual vs.           Partially-Automatic vs.
                              Partially-Automatic    Fully-Automatic      Fully-Automatic
Analysis                      Az        0.90Az       Az        0.90Az     Az        0.90Az
Maximum Side Difference       0.49      0.92         0.70      0.61       0.99      0.99
LDA Consistency               0.21      0.14         0.0017    0.0045     0.0053    0.033
LDA Round Robin               0.21      0.18         0.0034    0.0066     0.0094    0.030

Rows with entries lost in extraction (surviving values in source order; column positions uncertain):
Depth-to-Width Ratio          0.006, 0.035, 0.47, 0.036
Autocorrelation Based         <0.0001, 0.01, <0.0001, [illegible], 0.87
Normalized Radial Gradient    0.80, 0.39, 0.43, 0.042, 0.11

automatic  segmentation  to  fully-automatic  segmentation.  However,  we  fail  to  demonstrate 
a  statistical  difference  between  the  manual  and  partially-automatic  cases  indicating  that, 
given  good  estimates  of  the  lesion  width  and  height,  our  segmentation  algorithm  performs 
as  well  as  manual  segmentation  in  conjunction  with  our  automatic  classifier.  Figure  6  shows 
the  performance  of  the  LDA  in  terms  of  ROC  curves  for  each  type  of  segmentation  in  the 
task  of  distinguishing  malignant  from  benign  lesions. 

4  Summary 

We  have  developed  and  tested  a  segmentation  method  for  breast  lesions  on  ultrasound.  One 
of  the  advantages  of  the  method  is  that  it  tends  to  produce  margins  that  are  “lesion-like”. 
On  the  other  hand,  the  segmented  margins  delineate  the  general  shape  of  the  lesions  and  may 
not  depict  margin  details  such  as  spiculation  or  high  irregularity  [8].  However,  segmentation 
of  the  general  lesion  shape  appears  sufficient  for  the  features  chosen  in  our  experiment,  as 
indicated  by  the  performance  of  our  classifier  on  the  lesions  segmented  with  the  partially- 
automatic  method. 

In conclusion, our automatic classifier yielded Az values of 0.91 and 0.87 in distinguishing
malignant from benign lesions when using manual segmentation and fully-automatic
segmentation, respectively. Our results indicate that when used in conjunction with our automatic
classifier, our automatic segmentation algorithm for breast lesions on ultrasound performs
similarly to manual segmentation.

Figure 6: Performance of discriminant scores in distinguishing malignant from benign lesions
for manual, partially-automatic and fully-automatic segmentations. Results are from
round-robin analyses.
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5  Appendix 

To quantify the margin, we consider the normalized radial gradient (NRG) [12, 13], which is a measure of the average orientation of the gray level gradients along the margin. It is given by

$$\mathrm{NRG} = \frac{\sum_{P \in \Gamma} \nabla I(P) \cdot \hat{r}(P)}{\sum_{P \in \Gamma} \lVert \nabla I(P) \rVert},$$

where $I$ is the image, $P = (x, y)$ is the pixel location, $\Gamma$ is the discretized lesion margin, $\hat{r}(P)$ is the unit vector in the radial direction from the geometric center of the lesion to the point $P$, and $\cdot$ is the dot product between vectors. For ultrasound, the NRG is bounded between zero and one.
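As an illustration only, the NRG can be sketched in NumPy as follows. The function name, the use of `numpy.gradient` (central differences) for the gray-level gradient, and the use of the mean of the margin points as the geometric center are assumptions made for this sketch, not details taken from the text.

```python
import numpy as np

def normalized_radial_gradient(image, margin_points):
    """Sketch of the NRG; margin_points holds (x, y) pixel coordinates on the margin."""
    pts = np.asarray(margin_points, dtype=int)
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    center = pts.mean(axis=0)            # assumed stand-in for the geometric center
    radial = pts - center
    radial = radial / np.linalg.norm(radial, axis=1)[:, None]   # unit radial vectors
    grad = np.stack([gx[pts[:, 1], pts[:, 0]], gy[pts[:, 1], pts[:, 0]]], axis=1)
    numerator = np.sum(grad * radial)    # sum of gradient . radial unit vector
    denominator = np.sum(np.linalg.norm(grad, axis=1))
    return numerator / denominator
```

For a dark, round lesion whose gray level rises radially, every gradient is aligned with its radial vector and the value approaches one.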

The posterior acoustic behavior is quantified by comparing the gray-level values posterior to the lesion to those in adjacent tissue at the same depth. Define $A_p$ as the average gray-level of a region of interest (ROI) posterior to the lesion. Similarly, let $A_l$ be the average gray-level of the ROI to the left of the lesion at the same depth and $A_r$ the average gray-level of the ROI to the right of the lesion at the same depth. Then the minimum side difference (MSD) is

$$\mathrm{MSD} = \min\{A_p - A_l,\; A_p - A_r\}.$$
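A minimal sketch of the MSD computation is given below. How the three ROIs are placed is not specified here, so the sketch simply assumes they are supplied as rectangular index ranges; the function name and the ROI tuple layout are hypothetical.

```python
import numpy as np

def minimum_side_difference(image, posterior, left, right):
    """Each ROI is a hypothetical (row_start, row_stop, col_start, col_stop) tuple."""
    image = np.asarray(image, dtype=float)

    def roi_mean(roi):
        r0, r1, c0, c1 = roi
        return float(image[r0:r1, c0:c1].mean())

    a_p, a_l, a_r = roi_mean(posterior), roi_mean(left), roi_mean(right)
    return min(a_p - a_l, a_p - a_r)
```

A strongly negative value indicates posterior shadowing relative to both sides; a positive value indicates posterior enhancement.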


The shape feature that we consider is the depth-to-width ratio of the lesion [3]. Let $\gamma(i) = (\gamma_1(i), \gamma_2(i))$ be a discrete parameterization of the margin with $\gamma_1$ and $\gamma_2$ the coordinates in the lateral and depth directions respectively. Then

$$\mathrm{DWR} = \frac{\text{Depth}}{\text{Width}} = \frac{\max_i \gamma_2(i) - \min_i \gamma_2(i)}{\max_i \gamma_1(i) - \min_i \gamma_1(i)}.$$
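The depth-to-width ratio follows directly from the extent of the margin points. The sketch below assumes the margin is available as a list of (lateral, depth) coordinate pairs; the function name is illustrative.

```python
import numpy as np

def depth_to_width_ratio(margin):
    """margin: sequence of (lateral, depth) coordinates of the discretized margin."""
    g = np.asarray(margin, dtype=float)
    width = g[:, 0].max() - g[:, 0].min()   # extent in the lateral direction
    depth = g[:, 1].max() - g[:, 1].min()   # extent in the depth direction
    return depth / width
```

A lesion that is twice as wide as it is deep gives a ratio of 0.5.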


To quantify texture, the autocorrelation in depth of the minimal rectangular ROI $R$ containing the lesion is used to define

$$\mathrm{COR} = \frac{\sum_n \bar{C}_y(n)}{\bar{C}_y(0)}, \tag{15}$$

where the autocorrelation in depth and its sum in the lateral direction are

$$C_y(m, n) = \sum_p R^2(m, n + p)\, R^2(m, p),$$

$$\bar{C}_y(n) = \sum_m C_y(m, n).$$
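The texture feature can be sketched as below. This is an illustration under stated assumptions: the ROI is indexed as R(m, n) with m the lateral index and n the depth index, the sum over p is truncated at the ROI boundary, and COR is taken as the sum of the laterally summed autocorrelation over all depth lags divided by its value at lag zero.

```python
import numpy as np

def depth_autocorrelation_feature(roi):
    """roi[m, n]: m is the lateral index, n the depth index (assumed layout)."""
    r2 = np.asarray(roi, dtype=float) ** 2
    n_depth = r2.shape[1]
    c_bar = np.empty(n_depth)
    for n in range(n_depth):
        # sum over p of R^2(m, n + p) * R^2(m, p), then summed over m
        c_bar[n] = np.sum(r2[:, n:] * r2[:, :n_depth - n])
    return c_bar.sum() / c_bar[0]
```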
Contrast-enhanced magnetic resonance imaging (MRI) of the breast is known to reveal breast cancer with higher sensitivity than mammography alone. The specificity is, however, compromised by the observation that several benign masses take up contrast agent in addition to malignant lesions. The aim of this study is to increase the objectivity of breast cancer diagnosis in contrast-enhanced MRI by developing automated methods for computer-aided diagnosis. Our database consists of 27 MR studies from 27 patients. In each study, at least four MR series of both breasts are obtained using FLASH three-dimensional (3D) acquisition at 90 s time intervals after injection of Gadopentetate dimeglumine (Gd-DTPA) contrast agent. Each series consists of 64 coronal slices with a typical thickness of 2 mm, and a pixel size of 1.25 mm. The study contains 13 benign and 15 malignant lesions from which features are automatically extracted in 3D. These features include margin descriptors and radial gradient analysis as a function of time and space. Stepwise multiple regression is employed to obtain an effective subset of combined features. A final estimate of likelihood of malignancy is determined by linear discriminant analysis, and the performance of classification is evaluated by round-robin testing and receiver operating characteristics (ROC) analysis. To assess the efficacy of 3D analysis, the study is repeated in two dimensions (2D) using a representative slice through the middle of the lesion. In 2D and in 3D, radial gradient analysis and analysis of margin sharpness were found to be an effective combination to distinguish between benign and malignant masses (resulting area under the ROC curve: 0.96). Feature analysis in 3D was found to result in higher performance of lesion characterization than 2D feature analysis for the majority of single and combined features. In conclusion, automated feature extraction and classification has the potential to complement the interpretation of radiologists in an objective, consistent, and accurate way. © 1998 American Association of Physicists in Medicine. [S0094-2405(98)01509-0]

Key  words:  breast  imaging,  magnetic  resonance  imaging  (MRI),  computer-aided  diagnosis,  ROC 
analysis,  contrast  agent 


I.  INTRODUCTION 

Breast cancer is a major cause of death among women in most western countries. Although mammography has been demonstrated to be the most efficient tool for early detection of breast cancer, the technique may result in a missed fraction of cancers as high as 9%.' In addition, the fraction of lesions found by mammography that is sent to biopsy and proves to be malignant can be as low as 10%-20%.^ Accurate examination of mammograms is particularly difficult in dense breasts, because lesions may be occluded by dense tissue. Consequently, complementary information by ultrasound or biopsy is often obtained.

Magnetic resonance imaging (MRI) is a promising complementary technique to mammography because of its inherent three-dimensional (3D) nature. In addition to possible improvement of diagnostic accuracy in dense breasts, MRI has shown superior potential for quantification of tumor volume, and detection of multifocal and multicentric disease.^’'* These issues are of interest in the consideration of breast-conserving therapy. From current consensus, MR is particularly suited for specific problem cases, such as patients who have high risk of developing breast cancer, patients with implants, postoperative scars, or clinical evidence of breast cancer that cannot be detected by conventional diagnostic methods.'*^®

MRI has become practically useful for breast imaging since the introduction of contrast agents that alter the spin-lattice ($T_1$) relaxation time.^’^ Due to increased vascularity and capillary permeability of tumors,® contrast-enhanced MRI shows better distinction between lesions and normal tissue than conventional MRI alone. Nonetheless, contrast-enhanced MRI is known to enhance both malignant and some benign types of masses, thus compromising the specificity of the technique. In general, the sensitivity reported for diagnosis of breast cancer in MR images is larger than 90%,’® but the reported specificity varies considerably and may be substantially lower.® '® The majority of these studies are solely based on enhancement as a function of time. To improve the specificity without reducing the sensitivity, the morphology of enhancement has been studied as well."  '' Most of these techniques are, however, based on slice by slice assessment of the morphology in 2D. The performance is likely to improve when one takes full advantage of the 3D nature of the MR data.

An important aspect that may contribute to varying specificity is interobserver variation in the interpretation of the MR images. Nearly all studies presented to date are based on visual assessment by one or multiple radiologists. Mussurakis et al. report significant variability in the assessment of lesions in MR data by human readers and stress the importance of standardized terminology. Heywang-Köbrunner et al. indicate that differences in interpretation guidelines will influence the accuracy of contrast-enhanced MRI. Attempts to increase the objectivity of the interpretation have recently been reported using quantitative rating of features such as spiculation by a radiologist, followed by merging of these ratings using an interpretation model. In this scheme, the classification stage is objective, but the rating of the features is still subject to the interpretation of the radiologist.

Automated quantification and classification of features to discriminate between benign and malignant lesions has been pursued in other diagnostic areas such as mammography in the context of computer-aided diagnosis (CAD).*'* Several investigators have successfully developed methods for computerized detection'”’''^ and computerized classification'*"^*’ for ultimate use in CAD as “second readers” for radiologists. In addition to objective analysis, computerized analysis can take full advantage of information across slices in 3D data sets which is difficult to assess visually from individual images.

The aim of this study is to increase the objectivity of breast cancer diagnosis in contrast-enhanced MRI. This aim is pursued by automated extraction of features that quantify spatial properties of contrast enhancement in 3D, and by merging different features into an estimate of malignancy using automated classification. The ultimate objective of this feasibility study is to reduce the number of biopsies of benign lesions and to increase the sensitivity for cancer cases.

II.  MATERIAL  AND  METHODS 
A.  Image  and  patient  data 

The images in this study were obtained using fast low-angle shot (FLASH) 3D acquisition at field strength of 1.0 T (Siemens Impact, Siemens, Erlangen, Germany). The acquisition parameters were: TR = 14.0 ms, TE = 7.0 ms, and flip angle of 25°. Fat suppression was not employed. The patients were scanned in prone position using a standard double-breast coil. In total, 141 preoperative MR series were acquired from 27 patients. Each series contains 64 coronal slices with a typical field of view of 32 × 16 cm². Each slice contains 256 × 256 pixels of 1.25 × 1.25 mm² and has a typical thickness of 2 mm. There are no gaps in between the slices. Gadopentetate dimeglumine (Gd-DTPA) contrast agent was injected intravenously by power injection after acquisition of the precontrast MR series. At least four series were taken per patient at 90 s intervals. Figure 1 shows an example of a dynamic MR sequence on a single slice through a malignant lesion.

Fig. 1. Example of contrast enhancement in a malignant lesion, illustrated on a dynamic series of a single MR slice. Note the irregular “donut” shape of the lesion (arrow) as it enhances in time before merging with the background.

The database in this study contains 28 lesions: 13 benign and 15 malignant masses. Histology in 27 out of 28 lesions was confirmed by open excisional biopsy; one case was benign based on core biopsy and four-year follow up. The distribution of the size of the lesions is shown in Fig. 2. The relative position of the lesions in the breast varies; some are close to the skin and near the chest wall. Benign masses include fibroadenoma (6/13), papilloma (2/13), and benign mastopathy (5/13). Malignant cases include papillary (1/15), tubular (2/15), and medullary carcinoma (1/15), invasive lobular (3/15) and ductal (7/15) cancer, and ductal carcinoma in situ (DCIS) (1/15).

Fig. 2. Distribution of the size of benign and malignant lesions in our database. (Horizontal axis: volume of lesion in cm³, binned from 0-1 up to >10.)

Fig. 3. Visualization of lesion shape, size, and orientation in 3D with respect to reference landmarks. The MR data is acquired in slices in coronal orientation, but the computerized analysis of the lesion is performed in all three dimensions simultaneously.

B.  Feature  extraction 

In this preliminary study, suspect masses were delineated by a radiologist (U.B.) experienced in MR-mammography and blinded concerning the histological diagnosis. This segmentation was performed in the subtraction images (postcontrast-precontrast) by contouring the enhanced tumor area in each slice that intersected the lesion. All available subtraction images were used for this purpose. As an additional reference, the radiologist had access to the original (nonsubtracted) MR images as well. All other stages of the scheme, though, as described below, are fully automated.

The proposed strategy for computerized analysis of dynamic MR data in 3D consists of two consecutive stages: feature extraction and classification. The feature extraction stage is aimed at quantification of spatial properties of enhancement in suspicious lesions. Feature extraction has two parts: extraction of the breast volume, and quantification of spatial properties. Although the MR data is obtained in slices, the analysis of the lesion is performed in 3D, taking all directions into account (Fig. 3). The volume of the breast is extracted from the MR data by global segmentation of pixel values at a threshold that maximizes the interclass variance between two pixel-value regions.^' All slices in the data contribute to the computation of a single threshold value. The result of the segmentation is a 3D binary mask in which breast voxels are labeled with value “1,” and background voxels with value “0.” Remaining gaps in the mask are removed by morphological closing operations.^^ Morphological erosion^^ is employed to remove an empirically established margin of two voxels from the external surface of the breast mask. This step is required to prevent strong voxel-value gradients near the borders of the breast from being included in the computation of gradient-based lesion features. Subsequent computation of features in the original MR data is restricted to voxels that have value “1” at the corresponding locations in the breast mask.
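The breast-volume extraction described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the threshold is found by a histogram-based search for the maximum between-class variance, the erosion uses a 6-connected neighborhood (the structuring element is not stated in the text), and the morphological closing step is omitted for brevity. All function names and the bin count are assumptions.

```python
import numpy as np

def max_interclass_variance_threshold(values, bins=256):
    """Threshold maximizing the between-class variance of two intensity classes."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist.astype(float) / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                 # weight of the low-intensity class
    m = np.cumsum(p * centers)        # cumulative first moment
    denom = w0 * (1.0 - w0)
    between = np.zeros_like(denom)
    valid = denom > 0
    between[valid] = (m[-1] * w0[valid] - m[valid]) ** 2 / denom[valid]
    return edges[np.argmax(between) + 1]

def erode(mask, iterations=1):
    """6-connected binary erosion; voxels outside the volume count as background."""
    out = np.asarray(mask, dtype=bool)
    for _ in range(iterations):
        padded = np.pad(out, 1, constant_values=False)
        core = padded[1:-1, 1:-1, 1:-1].copy()
        for axis in range(3):
            for shift in (-1, 1):
                core &= np.roll(padded, shift, axis=axis)[1:-1, 1:-1, 1:-1]
        out = core
    return out

def breast_mask(volume, margin=2):
    """One global threshold for the whole volume, then removal of a 2-voxel margin."""
    volume = np.asarray(volume, dtype=float)
    mask = volume > max_interclass_variance_threshold(volume.ravel())
    return erode(mask, iterations=margin)
```

Features would then be computed only at voxels where the returned mask is true.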

Features investigated in this study concern the inhomogeneity of uptake in the lesion [Eqs. (1) and (2)], sharpness of the lesion margin [Eqs. (3) and (4)], analysis of the shape of the lesion [Eqs. (5) and (6)], and radial gradient analysis [Eq. (7)].

If the set of voxel values in the lesion at time frame $i$ is given by $F_l(\mathbf{r}, i)$, where vectors $\mathbf{r}$ point to the lesion, and index $i$ runs from frame 0 (i.e., the frame before injection of contrast) to $M - 1$ (where $M$ is the total number of time frames), then the inhomogeneity of contrast uptake is characterized by two features which are defined by

$$\max_{i=0,\ldots,M-1} \left\{ \frac{\operatorname{variance}_{\mathbf{r}} F_l(\mathbf{r}, i)}{[\operatorname{mean}_{\mathbf{r}} F_l(\mathbf{r}, i)]^2} \right\}, \tag{1}$$

referred to here as variance of uptake, and

$$\min_{i=0,\ldots,M-2} \left\{ \frac{\operatorname{variance}_{\mathbf{r}} F_l(\mathbf{r}, i)}{\operatorname{variance}_{\mathbf{r}} F_l(\mathbf{r}, i+1)} \right\}, \tag{2}$$

referred to here as change in variance of uptake, where $\operatorname{variance}_{\mathbf{r}} F_l(\mathbf{r}, i)$ denotes the computation of the variance of the voxel values at all $\mathbf{r}$ in the lesion at fixed time frame $i$.

The sharpness of the lesion margins is characterized by two features as well. The first feature is given by

$$\max_{i=0,\ldots,M-1} \left\{ \frac{\operatorname{mean}_{\mathbf{r}} \lVert \nabla [F_m(\mathbf{r}, i) - F_m(\mathbf{r}, 0)] \rVert}{\operatorname{mean}_{\mathbf{r}} F_m(\mathbf{r}, i)} \right\}, \tag{3}$$

referred to here as margin gradient, where $\nabla [F_m(\mathbf{r}, i) - F_m(\mathbf{r}, 0)]$ denotes the set of voxel-value gradients at the margin of the suspect lesion in the difference images of time frame $i$ and precontrast frame “0.” Thus, the sharpness of the uptake of contrast is computed at the lesion margin. The range of vectors $\mathbf{r}$ in $F_m$ is limited to a shell, three voxels thick, centered on the surface of the lesion. The shell is employed to account for small inaccuracies that may occur in the delineation of the lesion outlines.

The second feature related to margin sharpness is defined by

$$\frac{\operatorname{variance}_{\mathbf{r}} \lVert \nabla [F_m(\mathbf{r}, i) - F_m(\mathbf{r}, 0)] \rVert}{[\operatorname{mean}_{\mathbf{r}} F_m(\mathbf{r}, i)]^2}, \tag{4}$$

referred to here as variance of margin gradient, and is only computed from the subtraction frames of “i” and “0” where the margin gradient [Eq. (3)] is maximum.
In Eqs. (3) and (4), computation of the spatial gradient is accomplished in 3D by convolution with the components of a 3 × 3 × 3 Sobel filter^^ in three orthogonal directions. Note that this approach takes information on lesion margins across slices into account.
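The uptake and margin features of Eqs. (1)-(4) can be sketched in NumPy as below. Several points are assumptions made for illustration: the variance of uptake is normalized by the squared mean, `numpy.gradient` (central differences) stands in for the 3 × 3 × 3 Sobel filter, the lesion mask and the three-voxel margin shell are taken as given boolean arrays, and the voxel spacing of (2.0, 1.25, 1.25) mm assumes the first array axis runs across slices.

```python
import numpy as np

def uptake_features(frames, lesion):
    """frames: array (M, z, y, x) of time frames; lesion: boolean mask (z, y, x)."""
    var = np.array([f[lesion].var() for f in frames])
    mean = np.array([f[lesion].mean() for f in frames])
    variance_of_uptake = np.max(var / mean ** 2)      # Eq. (1), normalization assumed
    change_in_variance = np.min(var[:-1] / var[1:])   # Eq. (2), frames i and i + 1
    return variance_of_uptake, change_in_variance

def margin_features(frames, shell, spacing=(2.0, 1.25, 1.25)):
    """shell: boolean mask of the margin shell; spacing in mm per axis (assumed order)."""
    grad_mean, grad_var = [], []
    for f in frames:
        diff = f - frames[0]                  # subtraction image: frame i minus frame 0
        g = np.gradient(diff, *spacing)       # stand-in for the Sobel gradient
        mag = np.sqrt(sum(c ** 2 for c in g))[shell]
        m = f[shell].mean()
        grad_mean.append(mag.mean() / m)      # argument of the max in Eq. (3)
        grad_var.append(mag.var() / m ** 2)   # Eq. (4) for this frame
    best = int(np.argmax(grad_mean))          # frame where the margin gradient peaks
    return grad_mean[best], grad_var[best]
```

The second function returns the variance of margin gradient at the same frame that maximizes the margin gradient, as the text specifies.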

Circularity of the shape of the lesion in 3D is given by

$$\frac{\text{volume of lesion within sphere of effective diameter}}{\text{volume of lesion}}, \tag{5}$$

and irregularity in 3D by

$$1 - \frac{\pi \cdot \text{effective diameter}^2}{\text{surface of lesion}}, \tag{6}$$

where the effective diameter is defined by

$$2 \cdot \sqrt[3]{\frac{3 \cdot \text{volume of lesion}}{4\pi}}.$$

The volume and the surface of the lesion are estimated from the contours of the segmented masses. For this purpose, a set of binary images is created in which the pixels at and enclosed by the contours are set to value “1” (object pixels), and remaining pixels to value “0” (background pixels). The volume of the lesion is determined by multiplying the number of object pixels by the volume of one voxel in world coordinates. The surface of the lesion is computed by combining the set of 2D binary images into a 3D binary representation of the lesion. Next, the faces of the object voxels that are exposed to background voxels in the 3D binary volume are identified by examining the value of the neighboring voxels in the x, y, and z directions. The face of an object voxel is exposed to the background if the neighboring voxel has value “0.” The surface of the lesion is subsequently determined by calculating the sum of the areas of the faces exposed to the background in world coordinates. Note that the circularity and irregularity in 3D are computed from the volume and the surface of the lesion in world coordinates, rather than in voxel coordinates, to account for the differences between pixel size and slice thickness (i.e., the anisotropic voxel shapes).
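The volume, exposed-face surface, and the two shape features can be sketched as below. The sketch assumes that the sphere of effective diameter is centered on the lesion centroid, that irregularity is one minus the ratio of the equivalent-sphere surface to the lesion surface, and that the first array axis is the slice direction; the function name and default spacing are illustrative.

```python
import numpy as np

def shape_features(mask, spacing=(2.0, 1.25, 1.25)):
    """mask: 3D boolean lesion mask; spacing: voxel size in mm along (z, y, x)."""
    mask = np.asarray(mask, dtype=bool)
    dz, dy, dx = spacing
    volume = mask.sum() * dz * dy * dx
    # A face is exposed wherever object and background voxels are neighbors.
    padded = np.pad(mask, 1, constant_values=False).astype(np.int8)
    face_area = (dy * dx, dz * dx, dz * dy)   # faces normal to z, y, x
    surface = 0.0
    for axis in range(3):
        exposed = np.diff(padded, axis=axis) != 0
        surface += exposed.sum() * face_area[axis]
    d_eff = 2.0 * (3.0 * volume / (4.0 * np.pi)) ** (1.0 / 3.0)
    # Voxel centers in world coordinates relative to the lesion centroid (assumed center).
    centers = np.argwhere(mask) * np.array(spacing)
    dist = np.linalg.norm(centers - centers.mean(axis=0), axis=1)
    circularity = np.count_nonzero(dist <= d_eff / 2.0) / mask.sum()
    irregularity = 1.0 - np.pi * d_eff ** 2 / surface
    return volume, surface, circularity, irregularity
```

For a perfect sphere the irregularity would approach zero; a cube gives about 0.19 under this definition.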

Radial gradient analysis is based on examination of the angles between voxel-value gradients and lines intersecting a single point near the center of the suspect lesion (i.e., lines in radial directions). Radial gradient values are given by the dot product of the gradient direction and the radial direction. The histogram of radial gradient values, which quantifies the frequency of occurrence of the dot products in a given region of interest (Fig. 4), is called the radial gradient histogram (RGH). Analysis of the RGH yields

$$\max_{i=0,\ldots,M-1} \left\{ \operatorname{variance}_{p>0} H(p) \right\}, \tag{7}$$

referred to here as the variance of RGH values. In this relationship, $H(p)$ denotes the normalized RGH, and variable $p$ is given by the normalized dot product

$$p = \frac{\left| \nabla [F_b(\mathbf{r}, i) - F_b(\mathbf{r}, 0)] \cdot (\mathbf{r} - \mathbf{r}_c) \right|}{\lVert \nabla [F_b(\mathbf{r}, i) - F_b(\mathbf{r}, 0)] \rVert \cdot \lVert \mathbf{r} - \mathbf{r}_c \rVert},$$

where $\nabla [F_b(\mathbf{r}, i) - F_b(\mathbf{r}, 0)]$ indicates the set of voxel-value gradients in a rectangular box of interest at the subtracted time frames “i” and “0.” The box encompasses the suspect lesion with an additional margin of three voxels along all sides. Vector $\mathbf{r}_c$ points to the center of this rectangular box. Thus, $\mathbf{r}_c$ generally will not (and does not need to) point to the exact center of the lesion. This aspect will be reviewed in more detail in the Discussion section.

Fig. 4. The radial gradient histogram (RGH) of a volume of interest (VOI) with a benign lesion (a) and a malignant lesion (b). Shown are images of representative cross sections through the lesions. The radial vector (R) originates in the center of the VOI. The gradient vector (G) indicates the local direction of the voxel-value gradient. The RGH maps the dot product of R and G against the frequency of occurrence (RGH values). In benign lesions, R and G tend to point in comparable directions within the VOI, yielding a peak in the RGH around 1.0 (a). Malignant lesions typically extend in less spherical patterns resulting in a flat RGH (b). The variance of RGH values is used to quantify the flatness of the RGH.
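A sketch of the RGH variance for a single subtraction frame is shown below. It is illustrative only: `numpy.gradient` replaces the Sobel filter, radial vectors are measured in voxel indices (ignoring voxel anisotropy), voxels with zero gradient are skipped, and the number of histogram bins is an arbitrary choice not given in the text.

```python
import numpy as np

def rgh_variance(diff_box, bins=20):
    """diff_box: 3D subtraction image (frame i minus frame 0) cropped to the box of interest."""
    diff_box = np.asarray(diff_box, dtype=float)
    grads = np.stack(np.gradient(diff_box), axis=-1)      # gradient vector per voxel
    center = (np.array(diff_box.shape) - 1) / 2.0          # center of the box
    grid = np.meshgrid(*[np.arange(s) for s in diff_box.shape], indexing="ij")
    radial = np.stack(grid, axis=-1) - center
    g_norm = np.linalg.norm(grads, axis=-1)
    r_norm = np.linalg.norm(radial, axis=-1)
    valid = (g_norm > 0) & (r_norm > 0)
    dots = np.abs(np.sum(grads * radial, axis=-1))[valid]
    p = np.clip(dots / (g_norm[valid] * r_norm[valid]), 0.0, 1.0)
    hist, _ = np.histogram(p, bins=bins, range=(0.0, 1.0))
    h = hist / hist.sum()                                  # normalized RGH
    return h.var()
```

The feature of Eq. (7) would then be the maximum of this value over the available postcontrast frames. A peaked histogram (round lesion) gives a large variance and a flat histogram gives a small one.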

In essence, the above equations quantify the observation that malignant lesions take up contrast agent in a less homogeneous pattern than benign masses, have less sharp boundaries and are more irregularly shaped. “Circularity” quantifies how well the lesion conforms to a spherical shape [Eq. (5)], and “irregularity” indicates the roughness of the surface of the lesion [Eq. (6)]. Radial gradient analysis was previously applied to mammograms to quantify spiculation of projected masses.*® The analysis provides a measure that indicates how well the image structures in a region of interest (ROI) extend in a radial pattern originating from the center of the ROI. Round and well-defined masses produce different measures than irregular and spiculated lesions. In the current study, radial gradient analysis is extended to 3D. The feature “Variance of RGH values” quantifies how well the image structures in a volume of interest (VOI) extend in a spherical pattern originating from the center of the VOI (Fig. 4).

With the exception of circularity and irregularity, which are computed from the coordinates of the segmented lesions, all other features are extracted from the data at each available time frame and combined by minimum or maximum operations, such as described by Eqs. (1)-(4) and (7).

Fig. 5. Relationships of various features for the database of benign and malignant lesions: (a) margin gradient analysis and radial gradient analysis, (b) shape analysis.

To  assess  the  efficacy  of  3D  analysis  in  comparison  with 
conventional  2D  techniques,  the  feature  extraction  and 
analysis  procedures  were  repeated,  although  in  2D,  on  a 
single  representative  slice  through  the  middle  of  the  lesions. 

C.  Feature  selection  and  classification 

From the total set of seven features, stepwise multiple regression^"* produced a selection that performs efficiently in distinguishing between benign and malignant lesions. The technique involves adding and removing features to obtain a limited subset that provides statistically significant separation in the estimated likelihood of malignancy. Linear discriminant analysis^^ is employed to estimate this likelihood of malignancy from single or combined features.

D.  Evaluation 

The performance of the computerized method in classification (distinguishing between benign and malignant lesions) is quantified by receiver operating characteristics (ROC) analysis.^^ In particular the area under the ROC curve ($A_z$), which maps the fraction of false positives to the fraction of true positives, is used as a measure of performance in this study. Sensitivity is defined as the true-positive fraction, specificity as one minus the false-positive fraction. The area under the ROC curve at true-positive fractions larger than 0.9 (partial $A_z$) is employed to rate the performance of computerized analysis at high sensitivity levels.^^

The general performance of the computerized method is estimated by round-robin testing^^ on our current database. This “leave-one-out” technique involves training the estimator of the likelihood of malignancy on all cases but one, testing classification on that single case, and repeating the procedure until each case has been tested individually.
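The classification and evaluation steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the discriminant is a two-class Fisher linear discriminant with pooled covariance, and the area under the ROC curve is the empirical (Mann-Whitney) estimate, which stands in for whatever ROC curve fitting was used to obtain $A_z$. Function names are hypothetical.

```python
import numpy as np

def lda_fit(X, y):
    """Two-class linear discriminant; returns weights w and offset b."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance matrix.
    S = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (len(X) - 2)
    w = np.linalg.solve(np.atleast_2d(S), m1 - m0)
    b = -0.5 * w @ (m0 + m1)
    return w, b

def round_robin_scores(X, y):
    """Leave-one-out: train on all cases but one, score the held-out case."""
    scores = np.empty(len(y))
    for k in range(len(y)):
        keep = np.arange(len(y)) != k
        w, b = lda_fit(X[keep], y[keep])
        scores[k] = X[k] @ w + b
    return scores

def auc(scores, y):
    """Empirical area under the ROC curve; ties count one half."""
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))
```

Here `X` is a cases-by-features array and `y` holds 0 for benign and 1 for malignant cases; `auc(round_robin_scores(X, y), y)` gives the round-robin estimate.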


III.  RESULTS 

All features investigated in this study show potential for distinguishing between benign and malignant lesions (Fig. 5, Table I). As expected, benign masses were found to extend more along spherical patterns than malignant lesions, and the margins of benign masses were found to be sharper on average than the margins of malignant lesions. An interesting observation is, however, that the variance of sharpness along the margin of the lesions is larger on average for benign than for malignant masses [Fig. 5(a)]. A possible explanation is offered in the discussion section. Less surprising was the result that some malignant lesions tend to be more irregularly shaped than benign masses [Fig. 5(b)]. Circularity was, however, not found to be a strong feature to distinguish between benign and malignant lesions in our database (Table I). Using single features only, the highest performances were obtained with radial gradient analysis ($A_z$ = 0.88), sharpness ($A_z$ = 0.88 and 0.86), and shape analysis of the lesion ($A_z$ = 0.80). The corresponding ROC curves are shown in Fig. 6. For operation at sensitivities larger than 0.9, the highest performance was achieved with the “margin gradient” feature, yielding a partial $A_z$ value of 0.70. Note that although the “margin gradient” and “variance of margin gradient” have comparable $A_z$ values (Table I), their ROC curves are differently shaped (Fig. 6). The curve of the “margin gradient” feature is steeper, indicating that higher specificity can be achieved at high sensitivity. In addition, “variance of RGH values” and “margin gradient” have comparable $A_z$ values, although the shapes of their ROC curves differ. The “margin gradient” feature was found to perform better at higher sensitivity than the “variance of RGH values” (Fig. 6).

Table I. Area under the ROC curves ($A_z$) using 2D and 3D analysis of individual and combined features. The standard deviations (1 SD) are shown in parentheses.

Feature                                                    A_z (2D)       A_z (3D)
Inhomogeneity of uptake
  Variance of uptake                                       0.54 (0.11)    0.72 (0.11)
  Change in variance of uptake                             0.59 (0.11)    0.77 (0.10)
Sharpness
  Margin gradient                                          0.83 (0.07)    0.88 (0.07)
  Variance of margin gradient                              0.71 (0.10)    0.86 (0.07)
Shape
  Circularity                                              0.67 (0.10)    0.65 (0.10)
  Irregularity                                             0.66 (0.10)    0.80 (0.08)
Radial gradient analysis
  Variance of RGH values                                   0.80 (0.08)    0.88 (0.07)
Combinations of features
  Variance of RGH values and margin gradient               0.87 (0.11)    0.92 (0.05)
  Variance of RGH values and variance of margin gradient   0.86 (0.08)    0.96 (0.03)

Fig. 6. ROC curves showing the performance of the best single features in the task of distinguishing between benign and malignant lesions. (RGH = Radial gradient histogram.) (Horizontal axis: false-positive fraction.)

Fig. 7. ROC curves showing the performance of effective combined features in the task of distinguishing between benign and malignant lesions (using round-robin testing). (RGH = Radial gradient histogram.) (Horizontal axis: false-positive fraction.)

Stepwise multiple regression at a confidence level of 95% resulted in a combination of two features: “Variance of RGH values” [Eq. (7)] and “variance of margin gradient” [Eq. (4)]. Their combined performance in distinguishing between benign and malignant lesions resulted in an $A_z$ value of 0.96. The corresponding ROC curve is shown in Fig. 7. Some combinations that have more than two features yielded slightly better results, but this increase in performance was not found to be statistically significant.

Based on round-robin testing on our database, the computerized scheme achieved 77% specificity (10/13) at 100% sensitivity. Note that all lesions in our database had been biopsied. In other words, all benign lesions were basically misclassified before biopsy, whereas the computerized method misclassified only 3 of the 13 benign lesions without misclassifying any malignant cases. These preliminary results indicate that the computerized method has the potential to reduce the number of biopsies of benign lesions.
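The operating point quoted above can be checked with a short calculation: with 13 benign lesions of which 10 fall below the decision threshold, the specificity is 10/13, or about 77%. The sketch below shows how such an operating point could be read off a set of classifier outputs; the function name and the convention that higher scores mean "more likely malignant" are assumptions for illustration.

```python
def specificity_at_full_sensitivity(scores, labels):
    """labels: 1 for malignant, 0 for benign; higher score means more suspicious."""
    # Lowest threshold that still calls every malignant case positive.
    threshold = min(s for s, l in zip(scores, labels) if l == 1)
    benign = [s for s, l in zip(scores, labels) if l == 0]
    true_negatives = sum(s < threshold for s in benign)
    return true_negatives / len(benign)
```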


Although statistical significance of the differences in performance between 3D and 2D analysis of features could not be ascertained given the current size of our database, consistently superior performance using 3D analysis was found for nearly all single and all combined features (Table I).

IV.  DISCUSSION 

Automated extraction of mathematically defined features in 3D yields encouraging results in distinguishing benign from malignant lesions ($A_z$ = 0.96). $A_z$ values of classification achieved by manual rating of features by radiologists have been reported*^ to be around 0.86. Direct comparison with results reported in the literature is, however, difficult due to differences in the databases used. In addition, most reports indicate only a single operating point for sensitivity and specificity on the ROC curve. In our study, the operating point must be tuned to a desired trade-off between sensitivity and specificity. This trade-off would be determined clinically by a cost-benefit analysis. In our current study, a sensitivity of 100% can be obtained at specificity of 77%. At this operating point, all malignant lesions, including one case of DCIS, are successfully identified as malignant. One of the next steps of research is to study the accuracy of breast cancer diagnosis done by radiologists when assisted by the automated technique as a second opinion.

We  found  that  the  margins  of  benign  lesions  are  typically 
sharper  in  appearance  than  the  margins  of  malignant  lesions, 
which  is  consistent  with  observations  from  other  studies. 
When  the  average  sharpness  of  the  lesion  margins  is  high, 
deviations  from  this  average  caused  by  anatomical  morphol¬ 
ogy,  partial  volume  effect,  or  inaccuracies  in  segmentation  of 
the  lesion,  will  result  in  higher  variance  values  than  would 
occur  with  a  small  average  sharpness  of  lesion  margins. 
Thus, the variation of sharpness along the margin is expected
to be larger for benign than for malignant cases. Our preliminary
results indicate that this observation has good potential
to discriminate between benign and malignant lesions
[Fig. 5(a), Table I].

Medical Physics, Vol. 25, No. 9, September 1998

Gilhuijs, Giger, and Bick: Computerized analysis of breast lesions

In  the  current  study,  we  found  that  spatial  features  are 
effective  to  distinguish  between  benign  and  malignant  le¬ 
sions,  in  particular  the  combination  of  radial  gradient  analy¬ 
sis  and  analysis  of  margin  sharpness.  Most  studies  of 
contrast-enhanced MRI of the breast are based on analysis of
temporal features of uptake only (such as speed of uptake)
and report varying specificity. It is possible, however, that
consistently high performance can be obtained from temporal
features when the temporal resolution of the data is high.
Boetes et al. report encouraging results from temporal features
in data obtained at high temporal resolution at the expense
of spatial resolution. Physiological models of contrast
uptake and washout have also been applied to increase the
specificity of the diagnosis from temporal features, e.g., by
Tofts et al. The preliminary results from our study indicate
that good distinction between benign and malignant lesions
can be obtained from spatial features without extremes of
temporal or spatial resolution. An important aspect is, however,
3D acquisition and analysis of the spatial features. The
results  from  the  current  study  indicate  that  it  is  beneficial  to 
analyze  spatial  features  in  3D  rather  than  in  2D  to  distinguish 
between  benign  and  malignant  lesions. 

In  addition  to  differences  in  image  acquisition,  other  as¬ 
pects  may  also  influence  the  specificity  of  the  diagnosis.  Dif¬ 
ferences  in  bolus  size  and  particular  hemodynamic  character¬ 
istics  of  each  patient  as  well  as  hormonal  factors,  may  cause 
variable  enhancement.'*’*’  Image  artifacts  can  be  caused  by 
inhomogeneity  of  the  magnetic  field  and  by  patient  motion. 
To  reduce  the  effect  of  some  of  these  aspects,  features  have 
been  normalized  within  or  across  time  frames  in  the  same 
examination.  Gradient  artifacts  caused  by  inhomogeneity  of 
the  magnetic  field  typically  occur  at  much  lower  spatial  fre¬ 
quency  than  the  lesion  margins,  and  were  found  to  be  of  less 
importance  in  this  study.  Patient  movement  is  estimated  to 
be about 2 mm on average in our data set. Because the voxel
dimensions are 1.25 × 1.25 × 2.0 mm³, the motion causes
some blurring of the lesions rather than an actual displacement
of image structures. To avoid image artifacts due to
motion  of  the  heart,  the  MR  slices  were  obtained  in  coronal 
orientation.  Different  slice  thickness  may  also  result  in  dif¬ 
ferences  in  performance.  To  take  the  anisotropic  voxel 
shapes  in  the  MR  data  into  account,  features  related  to  the 
shape  of  the  lesion  are  calculated  in  world  coordinates,  rather 
than in voxel coordinates. Lesion sharpness is calculated,
however, using a 3 × 3 × 3 Sobel filter, which does not account
for the anisotropic voxel shape. An intuitive approach
would  be  to  resample  the  MR  voxels  into  a  uniform  coordi¬ 
nate  grid  by  linear  interpolation.  Such  linear  modeling  of  the 
discontinuity  at  the  edges  of  the  lesions  may,  however,  lead 
to  underestimation  of  sharp  lesion  margins,  thus  compromis¬ 
ing  the  benefit  of  the  correction.  In  addition,  the  approach 
does  not  take  other  effects  of  deviating  slice  thickness  into 
account, such as differences in partial volume effect. These
aspects are topics for future study.
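One way to see what accounting for anisotropic voxels would involve is to express the intensity gradient per millimetre rather than per voxel. The sketch below is not the Sobel-based sharpness measure of the study: it uses plain finite differences from NumPy, and the axis order (slice, row, column) with the 2.0 mm spacing along the slice axis is an assumption made for illustration.

```python
import numpy as np


def gradient_magnitude_mm(volume, spacing_mm=(2.0, 1.25, 1.25)):
    """Gradient magnitude in intensity units per mm.

    `volume` is indexed as (slice, row, column); `spacing_mm` gives the
    assumed voxel size along each of those axes.
    """
    gz, gy, gx = np.gradient(np.asarray(volume, dtype=float), *spacing_mm)
    return np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)


# A ramp of one intensity unit per slice corresponds to 0.5 units per mm
# when slices are 2.0 mm apart.
ramp = np.arange(5, dtype=float)[:, None, None] * np.ones((5, 4, 4))
print(gradient_magnitude_mm(ramp)[2, 1, 1])
```

Scaling the derivatives in this way avoids resampling the data, so no interpolation is applied across lesion edges.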


Automated  segmentation  of  the  lesions  is  likely  to  further 
improve  the  objectivity  of  diagnosis.  Techniques  for  this  pur¬ 
pose  have  recently  been  investigated,  e.g.,  Lucas-Quesada 
et  al.^^  and  will  remain  a  subject  of  future  research.  In  our 
study,  the  sharpness-related  features  are  computed  in  a  shell 
around  the  indicated  outline  of  the  tumor  to  account  for 
small  inaccuracies  in  the  segmentation.  Radial  gradient 
analysis  does  not  require  accurate  delineation  of  the  margins 
of  the  lesion:  The  region  of  interest  is  a  rectangular  box 
positioned  roughly  around  the  lesion  in  the  subtracted  im¬ 
ages.  Spurious  gradients  associated  with  background  noise 
are  expected  to  be  randomly  distributed  with  respect  to  the 
radial  directions  of  the  noise  voxels,  thus  adding  a  small 
constant  offset  to  the  RGH  values.  Consequently,  a  region  of 
interest  somewhat  larger  than  the  size  of  the  actual  lesion  is 
not  expected  to  have  much  effect  on  the  variance  of  the  RGH 
values.  The  radial  lines  intersect  the  center  of  the  rectangular 
bounding  box  positioned  around  the  lesion.  This  intersection 
point  will  usually  not  coincide  with  the  exact  center  of  the 
lesion.  Nevertheless,  since  the  shape  of  the  lesions  is  gener¬ 
ally  not  perfectly  symmetrical  and  regular,  the  center  of  the 
lesion  is  expected  to  vary  with  the  shape  of  the  lesions  in  a 
similar  way  as  the  center  of  the  rectangular  bounding  box  is 
expected  to  vary  with  the  shape  of  the  lesions.  Consequently, 
neither  definition  of  the  center  is  expected  to  yield  superior 
performance  compared  to  the  other. 

It  is  likely  that  the  ability  to  distinguish  between  benign 
and  malignant  lesions  will  decrease  for  smaller  tumor  sizes. 
The smallest lesion in our database has a volume of 0.1 cm³
(benign case) and was correctly classified at no loss of malignant
cases. At this operating point, 3 of the 13 benign
lesions were incorrectly classified as malignant. These lesions
had volumes of 0.2, 0.9, and 3.4 cm³, respectively
(papilloma and benign mastopathy). Other benign lesions
with similar histology and volumes were, however, correctly
classified, as well as all malignant lesions, which have sizes
ranging from 0.1 cm³ to larger than 10 cm³ (Fig. 2). In conclusion,
our database did not show an obvious correlation
between  accuracy  of  the  performance  of  the  computerized 
diagnosis  and  lesion  size,  nor  between  accuracy  and  histol¬ 
ogy.  Evaluation  of  the  computerized  analysis  technique  on 
larger  databases  is,  however,  required,  and  may  warrant  the 
use  of  more  advanced  classification  methods,  such  as  artifi¬ 
cial  neural  networks. 

Once  a  mass  is  suspected  to  be  malignant,  localization  for 
accurate  biopsy  is  a  next  step.  Techniques  for  MRI-directed 
biopsy  are  being  developed  and  evaluated.^^'^^  The  3D  nature 
of  the  MR  data  may  allow  useful  complementary  informa¬ 
tion  to  visualize  the  size  and  shape  of  the  lesion  as  well  as  its 
location  relative  to  the  nipple  position  and  pectoralis  muscle 
(Fig.  3). 

V.  CONCLUSIONS 

A technique aimed at computer-aided diagnosis of suspect
lesions in contrast-enhanced MRI of the breast has been developed
to increase the objectivity of breast cancer diagnosis.
Initial results of analysis of spatial features in 3D indicate
good accuracy of classification (A_z = 0.96), and higher performance
than analysis in 2D for the majority of single and
combined features. Consequently, automated extraction of
features that quantify the spatial properties of contrast enhancement
has the potential to complement the interpretation
of radiologists in an objective, consistent, and accurate way.
In addition, the computerized analysis technique shows potential
to reduce the fraction of biopsies of benign lesions.
Rationale and Objectives. Breast sonography is not routinely
used to distinguish benign from malignant solid
masses because of considerable overlap in their sonographic
appearances. The purpose of this study was to
investigate the computerized analysis of breast lesions in
ultrasonographic (US) images in order to ultimately aid in
the task of discriminating between malignant and benign
lesions.

Materials  and  Methods.  Features  related  to  lesion  margin, 
shape,  homogeneity  (texture),  and  posterior  acoustic  at¬ 
tenuation  pattern  in  US  images  of  the  breast  were  extracted 
and  calculated.  The  study  database  contained  184  digitized 
US  images  from  58  patients  with  78  lesions.  Benign  lesions 
were  confirmed  at  biopsy  or  cyst  aspiration  or  with  image 
interpretation  alone;  malignant  lesions  were  confirmed  at 
biopsy.  Performance  of  the  various  individual  features  and 
output  from  linear  discriminant  analysis  in  distinguishing 
benign  from  malignant  lesions  was  studied  by  using  re¬ 
ceiver  operating  characteristic  (ROC)  analysis. 

Results. At ROC analysis, the feature characterizing the
margin yielded A_z values (area under the ROC curve) of
0.85 and 0.75 in distinguishing between benign and malignant
lesions for the entire database and for an “equivocal”
database, respectively. The equivocal database contained
lesions that had been proved to be benign or malignant at
cyst aspiration or biopsy. Linear discriminant analysis
round-robin runs yielded A_z values of 0.94 and 0.87 in
distinguishing benign from malignant lesions for the entire
database and for the equivocal database, respectively.

Conclusion.  Computerized  analysis  of  US  images  has  the 
potential  to  increase  the  specificity  of  breast  sonography. 

Key  Words.  Artificial  intelligence;  breast  imaging;  com¬ 
puter-aided  diagnosis;  computer  vision;  differential  diag¬ 
nosis;  US  imaging. 


Breast  cancer  is  a  leading  cause  of  death  in  women,  causing 
an  estimated  44,000  deaths  per  year  (1).  Mammography 
is  the  most  effective  method  for  early  detection  of  breast 
cancer,  and  periodic  screening  of  asymptomatic  women 
reduces  the  mortality  rate  (2-4).  Many  breast  cancers  are 
detected,  and  these  patients  are  referred  for  biopsy  on  the 
basis  of  a  radiographically  observed  mass  lesion  or  cluster 
of microcalcifications. General rules for the differentiation
of benign from malignant mammographically identified
breast lesions exist (5,6), but considerable misclassification
of these lesions still occurs. On average, fewer than
30% of masses referred for surgical breast biopsy are actually
malignant (7).

Breast  sonography  is  an  important  adjunct  to  diagnos¬ 
tic  mammography,  and  it  is  typically  performed  on  pal¬ 
pable  and/or  mammographically  identified  masses  to  de¬ 
termine  their  cystic  or  solid  nature.  The  accuracy  rate  of 
ultrasonography  (US)  has  been  reported  to  be  96%-100% 
in  the  diagnosis  of  simple  benign  cysts  (8),  and  masses  so 
characterized  do  not  require  further  evaluation.  US  has 
not  been  used  for  screening  purposes,  however,  because 
of  relatively  high  false-negative  and  false-positive  rates. 
Even  so,  US  is  being  evaluated  as  a  potential  screening 
method in women with dense breasts (9). Physicians at
some centers are successful at visually distinguishing benign
from malignant masses by using US, but physicians
at most facilities are unable to rely on breast US to avoid
biopsy because of the considerable overlap in the sonographic
appearances of these masses.

With  the  advent  of  modern,  high-frequency  transducers 
that  have  improved  spatial  and  contrast  resolution,  how¬ 
ever,  several  sonographic  features  have  emerged  as  po¬ 
tential  indicators  of  malignancy  and  others  as  potential 
indicators of benign masses (10,11). Benign features include
hyperechogenicity, ellipsoid shape, mild lobulation,
and a thin, echogenic pseudocapsule. Malignant
features  include  spiculation,  angular  margins,  marked 
hypoechogenicity,  posterior  acoustic  shadowing,  and 
a  depth-to-width  ratio  greater  than  0.8. 

Stavros et al (12) used various features to characterize
masses as being either benign, indeterminate, or malignant.
Their classification scheme had a sensitivity of
98.4% and a negative predictive value of 99.5%. The
sonographic evaluation described by these investigators,
however, is much more extensive and complex than that
usually performed at most breast-imaging centers. US
is a notoriously operator-dependent modality, and until
these encouraging results are corroborated through additional
studies by other investigators, it is unclear how
widely applicable or reliable such sonographic classification
schemes truly are.

Computer-aided  techniques  have  been  applied  to  color 
Doppler  US  evaluation  of  breast  masses  with  promising 
results  (13).  Color  Doppler  imaging  is  a  technique  that 
focuses on the vascularity of lesions. Not all sonographically
visible cancers have demonstrable neovascularity,
however,  and  benign  lesions  can  be  vascular.  Therefore, 
the  sensitivity  and  specificity  of  this  technique  are  inher¬ 
ently  somewhat  limited.  These  limitations  have  been 
demonstrated  in  power  Doppler  imaging  of  solid  breast 
masses  (14). 

Comprehensive  summaries  of  investigations  regarding 
mammographic  computer-aided  diagnosis  have  been 
published  (15,16).  During  the  1960s  and  1970s,  several 
investigators  attempted  to  analyze  mammographic  abnor¬ 
malities  by  using  computers  (17-24).  These  investigators 
demonstrated  the  potential  capability  of  computers  in  the 
detection  of  mammographic  abnormalities.  Gale  et  al 
(17)  and  Getty  et  al  (18)  both  reported  on  computer- 
based  classifiers  that  take  diagnostically  relevant  features 
obtained  from  radiologists’  readings  of  breast  images 
as  input.  Getty  et  al  found  that  with  use  of  this  classifier, 
community  radiologists  performed  as  well  as  unaided 


expert  mammographers  in  differentiating  benign  from 
malignant  lesions.  In  addition,  Swett  and  Miller  (19) 
developed  an  expert  system  to  provide  both  visual  and 
cognitive  feedback  to  radiologists  by  using  a  critiquing 
approach  combined  with  an  expert  system.  At  the  Uni¬ 
versity  of  Chicago,  we  have  shown  that  computerized 
analysis  of  mass  lesions  (21,23)  and  clustered  micro¬ 
calcifications  (22,24)  as  shown  on  digitized  mammo¬ 
grams  yields  performance  rates  similar  to  those  of  expert 
mammographers  and  significantly  better  (P  <  .05)  than 
those  of  average  radiologists  in  distinguishing  malignant 
from  benign  lesions. 

US  is  a  digital  modality  that  is  amenable  to  application 
of  computer-aided  diagnosis  techniques  that  could  ulti¬ 
mately  be  used  in  a  real-time  fashion  (at  the  time  of  ex¬ 
amination)  to  improve  diagnostic  accuracy.  Given  that 
sonographic  interpretation  is  a  subjective  process,  how¬ 
ever,  and  that  criteria  have  been  developed  that  may  al¬ 
low  for  differentiation  of  benign  from  malignant  solid 
breast  masses,  it  is  reasonable  to  assume  that  computer- 
aided  diagnosis  techniques  applied  to  sonographic  im¬ 
ages  would  also  improve  radiologists’  performance,  par¬ 
ticularly  when  this  method  is  combined  with  correspond¬ 
ing  mammographic  data  (25).  Recently,  Garra  et  al  (26) 
showed  promising  results  with  the  use  of  computer-ex¬ 
tracted  features  derived  from  co-occurrence  matrices  of 
images  of  breast  lesions. 

In this study, we attempted to determine if computer
analysis of breast lesions in gray-scale US images could
be used to discriminate malignant from benign lesions.


MATERIALS  AND  METHODS 


Masses  were  viewed  sonographically  by  filming  repre¬ 
sentative  images  in  orthogonal  planes.  The  US  examina¬ 
tions  were  performed  with  an  Ultramark  9  with  High 
Definition  Imaging  (HDI)  from  Advanced  Technology 
Laboratories  (Bothell,  Wash)  with  a  high-frequency,  7.5- 
MHz,  electronically  focused,  near-field  imaging  probe. 
The  static  images  of  lesions  that  did  not  contain  overlaid 
cursors  or  color  Doppler  signals  were  used  in  this  study. 

The  US  film  images  were  retrospectively  collected  and 
then  digitized  with  a  laser  film  scanner  (KFDR-S;  Konica, 
Tokyo,  Japan)  with  a  scanner  pixel  size  of  0.1  mm  and  10- 
bit  quantization.  Each  multiformat  film  contained  only  one 
US  image.  Film  digitization  is  not  the  optimal  approach  to 
acquiring  digital  US  data,  but  it  was  the  only  one  available 
for  this  initial  study. 
COMPUTER ANALYSIS OF BREAST US IMAGES


The  58  patients  in  the  study  ranged  in  age  from  35 
to  89  years  (mean  age,  53  years).  They  had  78  masses  as 
shown  on  184  digitized  US  images.  Benign  lesions  were 
confirmed  at  biopsy  or  cyst  aspiration  or  with  image  in¬ 
terpretation  alone,  whereas  malignant  lesions  were  con¬ 
firmed  at  biopsy.  Of  the  total  184  images,  144  were  from 
43  patients  with  62  benign  lesions,  and  40  were  from  15 
patients  with  16  malignant  lesions.  Benign  lesions  in¬ 
cluded  simple  cysts,  complex  cysts,  and  solid  masses. 

Of the 62 benign lesions, 19 (all solid) were proved at
biopsy, five (one solid lesion and four complex cysts) were
proved at cyst aspiration, and 38 (four solid lesions and
34 cysts) were deemed to be benign on the basis of visual
interpretation of the US images alone. All 16 malignant
lesions were proved at biopsy.

Lesions  were  further  subcategorized  into  an  “equivo¬ 
cal”  category  on  the  basis  of  the  necessity  of  performing 
an  interventional  procedure  to  determine  their  status.  The 
24  benign  lesions  that  were  proved  at  biopsy  or  cyst  aspi¬ 
ration  and  the  16  malignant  lesions  made  up  the  equivocal 
database  (total,  40  lesions).  The  purpose  of  this  subcate¬ 
gorization  was  to  determine  the  ability  of  the  computer 
features  to  distinguish  benign  from  malignant  lesions  that 
required  an  interventional  procedure  (cyst  aspiration  or 
biopsy)  for  definitive  diagnosis. 

Manual Lesion Segmentation and Region-of-Interest Selection

Once  digitized,  the  US  images  were  displayed  on  an 
IBM  monitor,  and  a  breast-imaging  radiologist  (D.E.W.) 
outlined  the  approximate  margins  of  each  lesion.  Figure  1 
shows  US  images  of  breast  lesions  with  outlined  margins. 
Regions  of  interest  (ROIs)  of  32  x  32  pixels  were  selected 
from  regions  within  and  around  the  lesions.  Features  were 
calculated  on  the  basis  of  the  manually  extracted  lesion 
margin  or  the  32  x  32-pixel  ROI. 

Automated  Feature  Extraction 

Four  types  of  lesion  characteristics  were  investigated: 
margin,  shape,  homogeneity  (texture),  and  posterior 
acoustic  attenuation.  Table  1  lists  these  characteristics 
and  their  relationships  to  benign  and  malignant  lesions. 

The  lesion  characteristics  were  quantified  by  using 
various  computer-extracted  features,  and  Table  2  lists  the 
computer-extracted  features  used  in  distinguishing  malig¬ 
nant  from  benign  lesions.  Features  were  calculated  either 
along  or  within  the  lesion  margin  or  within  the  32  x  32- 
pixel  ROI  (placed  within  the  central  portion  of  or  poste¬ 
rior  to  the  lesion). 


To  quantify  the  lesion  margin  characteristics,  a  gradient 
analysis  was  performed  along  a  computer-expanded  margin 
of  the  lesion.  In  this  analysis,  the  manually  extracted  margin 
was  first  expanded  by  using  morphologic  filtering.  Next, 
this  region  was  processed  by  using  a  Sobel  filter  to  obtain 
the  gradient  and  its  direction  at  each  pixel.  The  normalized 
radial  gradient  was  then  calculated  to  quantify  the  margin 
sharpness  and  degree  of  irregularity  (shape)  (21,27).  The 
normalized  radial  gradient  (21,27)  is  given  by  the  equation 

$$
\text{normalized radial gradient} =
\frac{\sum_{P \in \text{margin}} \cos\varphi \, \sqrt{D_x^2 + D_y^2}}
     {\sum_{P \in \text{margin}} \sqrt{D_x^2 + D_y^2}},
$$

where $D_x$ is the gradient along the x axis, $D_y$ is the gradient
along the y axis, and $\varphi$ is the angle between the gradient
vector and the radial direction. A lower value for the normalized
radial gradient indicates a less distinct margin.
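The normalized radial gradient can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the Sobel kernels are written out with NumPy, the margin region is passed in as a boolean mask (the morphologic expansion step is not shown), and the radial direction is taken from a caller-supplied center point. The function names are hypothetical.

```python
import numpy as np


def sobel_gradients(image):
    """Return (dx, dy) Sobel responses; x runs along columns, y along rows."""
    p = np.pad(np.asarray(image, dtype=float), 1, mode="edge")
    dx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    dy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return dx, dy


def normalized_radial_gradient(image, margin_mask, center):
    """Sum of radial gradient components over the margin, divided by the
    sum of gradient magnitudes over the same pixels."""
    dx, dy = sobel_gradients(image)
    rows, cols = np.nonzero(margin_mask)
    ry, rx = rows - center[0], cols - center[1]
    r = np.hypot(ry, rx)
    keep = r > 0  # the radial direction is undefined at the center itself
    gx, gy = dx[rows, cols][keep], dy[rows, cols][keep]
    radial = (gx * rx[keep] + gy * ry[keep]) / r[keep]  # |g| * cos(phi)
    total = np.hypot(gx, gy).sum()
    return float(radial.sum() / total) if total > 0 else 0.0


# Synthetic check: intensity grows with distance from the center, so the
# gradient is radial everywhere and the measure is close to 1.
yy, xx = np.mgrid[0:41, 0:41]
dist = np.hypot(yy - 20, xx - 20)
ring = (dist > 9) & (dist < 11)
print(normalized_radial_gradient(dist, ring, (20, 20)))
```

The sign depends on whether the lesion is darker or brighter than its surroundings, so a pipeline built on this sketch would need to fix a convention before comparing lesions.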

The geometric measure of shape in terms of a short-to-long
axis ratio for each lesion was determined by using
the image data along the margin. Note that this ratio is
expressed as a depth-to-width ratio so that it also reflects
the orientation of the long axis. Cysts tend
to be ellipsoid, thereby resulting in a depth-to-width ratio
of much less than 1, whereas malignant lesions tend to
have a vertical or round axial orientation (28).
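As a rough illustration of the shape feature, the depth-to-width ratio can be approximated from the extents of the outlined margin. The sketch assumes that image rows run along the depth direction and columns along the width, and it uses axis-aligned extents only; the text does not specify how the axes were actually measured, so this is one possible reading.

```python
def depth_to_width_ratio(margin_points):
    """Ratio of vertical to horizontal extent of a lesion outline.

    `margin_points` is a sequence of (row, column) pixel coordinates;
    rows are assumed to increase with depth.
    """
    rows = [r for r, _ in margin_points]
    cols = [c for _, c in margin_points]
    depth = max(rows) - min(rows)
    width = max(cols) - min(cols)
    if width == 0:
        raise ValueError("margin has zero width")
    return depth / width


# An outline that is twice as wide as it is deep gives a ratio of 0.5.
outline = [(0, 10), (5, 0), (10, 10), (5, 20)]
print(depth_to_width_ratio(outline))
```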

Texture  can  be  described  through  spatial  relationships 
between  image  pixels  by  using  changes  in  the  intensity 
patterns  and  gray  levels.  Texture  characteristics  of  the 
homogeneity within the lesion were determined by using
a  measure  of  coarseness  (29).  The  texture  measure  of 
coarseness  (local  uniformity)  is  given  by  the  equation 

$$
\text{coarseness} = \left[ \sum_{i=0}^{G_h} p_i \, s(i) \right]^{-1},
$$

where $G_h$ is the highest gray-level value in the ROI and $p_i$
is the probability of occurrence for gray level $i$. Thus, if $N$
is the width of the ROI ($N = 32$) and $d$ is the neighboring size
(half the operating kernel size $W$), the $i$th entry of $s$ is
given by

$$
s(i) =
\begin{cases}
\sum_{(x,y) \in \{N_i\}} \lvert i - A_i \rvert & \text{if } N_i \neq 0, \\
0 & \text{otherwise,}
\end{cases}
$$

where $\{N_i\}$ is the set of pixels having gray level $i$,

$$
A_i = \frac{1}{W - 1} \sum_{p=-d}^{d} \sum_{q=-d}^{d} f(x + p, y + q),
\quad (p, q) \neq (0, 0) \text{ to exclude } (x, y),
$$

and $W = (2d + 1)^2$ ($d = 3$).

Thus, a lower value of coarseness corresponds to a finer
visual texture.

Figure 1. (a) US image of a simple cyst, (b) US image of a complex cyst, (c) US image of a benign solid lesion, (d) US image of a malignant
lesion, along with radiologist-drawn lesion margin.

Table 1
Lesion Characteristics and Their Relationship to Benign and Malignant Lesions

Characteristic                 | Benign (Cystic or Solid)                       | Malignant
Lesion margin                  | Smooth borders                                 | Angular margins, spiculation
Lesion shape                   | Ellipsoid, mildly lobulated                    | Irregular; depth-to-width ratio, >0.8
Texture within lesion          | Anechoic, hyperechoic, reverberation artifacts | Hypoechoic
Posterior acoustic attenuation | Posterior enhancement                          | Posterior shadowing

The computerized assessment of posterior acoustic
attenuation or enhancement associated with different lesions
was determined in two ways: (a) by comparing the
gray-level values within the lesion with those posterior
to that lesion and (b) by comparing the gray-level values
posterior to the lesion with those in adjacent tissue at the
same depth. These calculations were performed to quantify
the amount of any posterior acoustic shadowing or enhancement.
For example, benign lesions are often associated with
posterior enhancement, whereas malignant lesions are often
associated with posterior shadowing. Simple cysts that are
anechoic produce less attenuation of the US waves than surrounding
parenchyma produces and, thereby, cause relative
hyperechogenicity posterior to the lesion. In this analysis,
32 × 32-pixel ROIs were placed within the lesion, posterior
to the lesion, and in the adjacent tissue at the same depth,
and the differences in average gray levels were then calculated
to quantify the posterior acoustic attenuation.
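Returning to the texture feature, the coarseness measure (the reciprocal of the sum over gray levels of p_i s(i), with s(i) accumulating the absolute difference between gray level i and the mean of its neighborhood) can be sketched as follows. The sketch assumes integer gray levels, evaluates only pixels whose full (2d + 1) × (2d + 1) neighborhood lies inside the ROI, and estimates p_i from those same pixels; these choices and the function name are assumptions, not details given in the text.

```python
import numpy as np


def coarseness(roi, d=3):
    """Reciprocal of sum_i p_i * s(i) for an integer-valued ROI."""
    roi = np.asarray(roi, dtype=int)
    n_rows, n_cols = roi.shape
    w = (2 * d + 1) ** 2               # number of pixels in the kernel
    levels = int(roi.max()) + 1
    s = np.zeros(levels)               # s(i): summed |i - A_i| per gray level
    counts = np.zeros(levels)          # number of evaluated pixels per level
    for y in range(d, n_rows - d):
        for x in range(d, n_cols - d):
            window = roi[y - d:y + d + 1, x - d:x + d + 1]
            i = roi[y, x]
            neighbor_mean = (window.sum() - i) / (w - 1)  # excludes (x, y)
            s[i] += abs(i - neighbor_mean)
            counts[i] += 1
    p = counts / counts.sum()
    total = float(np.sum(p * s))
    return float("inf") if total == 0 else 1.0 / total


rng = np.random.default_rng(0)
noisy = rng.integers(0, 256, size=(32, 32))        # rapidly varying texture
bands = np.tile(np.arange(32) // 8, (32, 1))        # slowly varying texture
print(coarseness(noisy), coarseness(bands))
```

In this sketch the rapidly varying ROI yields the smaller value, and a perfectly uniform ROI is reported as infinite because the sum in the denominator is zero.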

The  feature  value  for  a  given  lesion  was  obtained  by  av¬ 
eraging  that  feature  value  over  all  views  of  the  lesion.  Each 
lesion  had  from  two  to  five  images  available  from  one  clini¬ 
cal  examination. 
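The two posterior acoustic features and the averaging over views can be sketched together. ROI positions are passed in as top-left corners, and the sign convention (lesion minus posterior, posterior minus adjacent) is an assumption; the text states only that differences in average gray level were used.

```python
import numpy as np


def roi_mean(image, top_left, size=32):
    """Mean gray level of a size x size ROI with the given top-left corner."""
    r, c = top_left
    return float(np.mean(image[r:r + size, c:c + size]))


def posterior_differences(image, lesion_tl, posterior_tl, adjacent_tl, size=32):
    """Return (lesion - posterior, posterior - adjacent) mean gray levels."""
    lesion = roi_mean(image, lesion_tl, size)
    posterior = roi_mean(image, posterior_tl, size)
    adjacent = roi_mean(image, adjacent_tl, size)
    return lesion - posterior, posterior - adjacent


def average_over_views(values):
    """Feature value for a lesion: the mean over its available views."""
    return float(np.mean(values))


# Synthetic view: dark lesion, bright region behind it (enhancement).
view = np.zeros((96, 96))
view[0:32, 32:64] = 10    # within lesion
view[32:64, 32:64] = 50   # posterior to lesion
view[32:64, 0:32] = 30    # adjacent tissue at the same depth
print(posterior_differences(view, (0, 32), (32, 32), (32, 0)))
```

With this convention, posterior enhancement gives a positive second value and posterior shadowing a negative one.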

Linear  discriminant  analysis  (LDA)  was  used  to  merge 
the  four  individual,  computer-extracted  features  into  a 
single  index  related  to  an  estimate  for  the  likelihood  of 
malignancy.  In  LDA,  the  discriminant  function  is  formu¬ 
lated  by  using  a  linear  combination  of  the  individual  fea¬ 
tures  (30).  Both  consistency  and  round-robin  runs  were 
performed.  In  round-robin  analysis,  the  discriminant 
function  is  trained  on  all  but  one  case  and  is  then  tested 
on  that  remaining  case;  this  process  is  repeated  until  all 
cases  have  been  individually  tested. 
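The difference between consistency and round-robin runs can be illustrated with a two-class Fisher discriminant. This is a generic sketch, not the software used in the study: labels are assumed to be 0 for benign and 1 for malignant, and scores are centered on the midpoint of the class means so that values from different round-robin folds are roughly comparable.

```python
import numpy as np


def fit_lda(X, y):
    """Fisher linear discriminant: returns weights and class-mean midpoint."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix.
    scatter = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.pinv(scatter) @ (m1 - m0)
    return w, 0.5 * (m0 + m1)


def consistency_scores(X, y):
    """Train and test on the same cases (optimistic estimate)."""
    w, midpoint = fit_lda(X, y)
    return (X - midpoint) @ w


def round_robin_scores(X, y):
    """Leave one case out, train on the rest, score the held-out case."""
    n = len(y)
    scores = np.empty(n)
    for k in range(n):
        train = np.arange(n) != k
        w, midpoint = fit_lda(X[train], y[train])
        scores[k] = (X[k] - midpoint) @ w
    return scores


# Synthetic four-feature data with well-separated classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 4)), rng.normal(5, 1, (20, 4))])
y = np.array([0] * 20 + [1] * 20)
print(round_robin_scores(X, y)[:3])
```

The resulting scores play the role of the single index that is then fed to ROC analysis; the round-robin scores are the less biased of the two because each case is scored by a function that never saw it.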

Evaluation 

Receiver  operating  characteristic  (ROC)  analysis  (31) 
was  used  to  evaluate  (by  case,  not  by  image)  the  perfor¬ 
mance  of  the  individual  computer-extracted  features  in  dis¬ 
tinguishing  benign  from  malignant  lesions.  The  decision 
variable  for  the  ROC  analysis  was  each  individual  feature. 
The area under the ROC curve (A_z) was used as an indicator
of performance. Specificity at high sensitivity is relevant
clinically, because the cost of missing a cancer is greater
than the cost of performing an interventional procedure for
a benign lesion. Therefore, we also calculated the performance
of the features in the high-sensitivity range (true-positive
fraction [TPF_0], >0.90) by using the partial area
index (0.90A'_z), which is the portion of the area under the
ROC curve that lies above the true-positive fraction TPF_0,
divided by the constant (1 - TPF_0) (32). Both A_z and partial
A_z values were calculated for the entire database and for
the equivocal database.
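The two summary indices can be illustrated on an empirical ROC curve. Note the substitution: the values in this study are described as coming from fitted ROC curves, whereas the sketch below uses a piecewise-linear empirical curve with trapezoidal integration, so its numbers would not match fitted estimates exactly. Higher scores are assumed to indicate malignancy, and the function names are hypothetical.

```python
def roc_points(malignant_scores, benign_scores):
    """Empirical ROC operating points as (FPF, TPF), from (0, 0) to (1, 1)."""
    thresholds = sorted(set(malignant_scores) | set(benign_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpf = sum(s >= t for s in malignant_scores) / len(malignant_scores)
        fpf = sum(s >= t for s in benign_scores) / len(benign_scores)
        points.append((fpf, tpf))
    return points


def area_above(points, tpf0=0.0):
    """Area under the piecewise-linear ROC curve and above the line TPF = tpf0."""
    area = 0.0
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        a, b = t0 - tpf0, t1 - tpf0      # heights above the line at both ends
        if b <= 0:
            continue                      # segment lies entirely below the line
        if a < 0:                         # segment crosses the line: keep top part
            area += 0.5 * b * (f1 - f0) * (b / (b - a))
        else:
            area += 0.5 * (a + b) * (f1 - f0)
    return area


def partial_area_index(points, tpf0=0.90):
    """Area above TPF = tpf0, normalized by (1 - tpf0)."""
    return area_above(points, tpf0) / (1.0 - tpf0)


pts = roc_points([3, 4], [1, 2])          # perfectly separated scores
print(area_above(pts), partial_area_index(pts))
```

A curve along the chance diagonal gives a partial area index of (1 - TPF_0)/2, which is 0.05 at TPF_0 = 0.90, so the index is not directly comparable in magnitude to the full area.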


RESULTS 


The A_z values for the various computer-extracted US features
ranged from 0.54 to 0.85 in distinguishing benign
from malignant lesions. Table 2 provides these values for
both the entire database and the equivocal database. Because
missing a cancer is more important clinically than
performing an interventional procedure for a benign lesion,
we used the partial area index to quantify performance
of the features at a high-sensitivity level (32). Table 2
provides these 0.90A'_z values for the entire database and
the equivocal database.

When the first four features (listed in Table 2) were used,
LDA consistency runs yielded A_z values of 0.95 and 0.93 in
distinguishing benign from malignant lesions for the entire
database and the equivocal database, respectively, and
the round-robin runs yielded A_z values of 0.94 and 0.87 in
distinguishing benign from malignant lesions for the entire
database and the equivocal database, respectively (Table 2).
When the second posterior acoustic attenuation feature was
used, LDA consistency runs yielded A_z values of 0.94 and
0.93 in distinguishing benign from malignant lesions for the
entire database and the equivocal database, respectively,
and the round-robin runs yielded A_z values of 0.92 and 0.86
in distinguishing benign from malignant lesions for the entire
database and the equivocal database, respectively
(Table 2).


DISCUSSION 


Figure 2 shows cluster plots of the coarseness and margin
features for malignant and benign lesions in the entire
database and the equivocal database. Figure 3 shows cluster
plots of the depth-to-width ratio and the (first) posterior
acoustic attenuation feature for malignant and benign lesions
in the entire database and the equivocal database. As
these figures show, malignant lesions tend to exhibit less
distinct margins and more posterior shadowing than benign
lesions, as documented by using visual US criteria
(10,12,26,28). It is interesting that many benign lesions
not in the equivocal database of this study had a very
“fine” texture (a low coarseness feature), because many
were cysts and anechoic. In contrast, the benign solid
lesions tended to have a coarse texture.

Table 2
Computer-extracted Features Used to Quantify Lesion Characteristics

Analysis                                              | Region for Analysis | Entire Database*: A_z | Entire Database*: 0.90A'_z | Equivocal Database†: A_z | Equivocal Database†: 0.90A'_z
Lesion margin                                         |        |      |      |      |
  1. Normalized radial gradient                       | Margin | 0.85 | 0.46 | 0.75 | 0.28
Shape                                                 |        |      |      |      |
  2. Depth-to-width ratio                             | Margin | 0.67 | 0.20 | 0.75 | 0.29
Texture within the lesion                             |        |      |      |      |
  3. Coarseness                                       | ROI    | 0.54 | 0.12 | 0.67 | 0.14
Posterior acoustic attenuation                        |        |      |      |      |
  4. Difference in gray level between “within lesion” |        |      |      |      |
     and posterior to lesion                          | ROIs   | 0.77 | 0.29 | 0.72 | 0.27
  5. Difference in gray level between “posterior to   |        |      |      |      |
     lesion” and adjacent tissue at same depth        | ROIs   | 0.83 | 0.35 | 0.72 | 0.17
Linear discriminant analysis (features 1, 2, 3, and 4)|        |      |      |      |
  Consistency analysis                                |        | 0.95 | 0.78 | 0.93 | 0.70
  Round-robin analysis                                |        | 0.94 | 0.76 | 0.87 | 0.56
Linear discriminant analysis (features 1, 2, 3, and 5)|        |      |      |      |
  Consistency analysis                                |        | 0.94 | 0.68 | 0.93 | 0.58
  Round-robin analysis                                |        | 0.92 | 0.59 | 0.86 | 0.38

Note. Performance is given in terms of A_z and partial A_z (0.90A'_z) for the entire database and the “equivocal” database in distinguishing
malignant from benign lesions.

*n = 78 cases.

†n = 40 cases.

Figure 2. Cluster plots indicate feature values for margin and texture, (a) Values for the entire database, (b) Values for the equivocal
database.

Figure 4 shows performance in terms of ROC curves for the features
characterizing margin and acoustic attenuation in distinguishing
malignant from benign lesions for the entire database and for the
equivocal database. The ROC curves are lower for the equivocal
database than for the entire database, indicating that, as for
radiologists, benign lesions in the equivocal database are more
difficult to distinguish from malignant lesions. Figure 5, which shows
histograms of the feature values (for malignant lesions, benign
lesions in the equivocal database, and the remaining benign lesions)
for margin, texture, and posterior acoustic attenuation, further
illustrates this point. As shown, malignant lesions tend to exhibit a
coarse texture, less distinct margins, and posterior shadowing. There
is, however, substantial overlap between the features of malignant
lesions and those of benign lesions in the equivocal database.

Table 2 and Figure 6 show performance of the linear discriminant
function in distinguishing malignant from benign lesions for the
entire database and the equivocal database. Of note are the partial
A_z values and the shape of the ROC curves. Extraction from the fitted
ROC curves for the equivocal database (in which all lesions underwent
a clinical procedure [cyst aspiration or biopsy]) indicates that at a
high sensitivity level (90%) for malignant cases, 30% of the benign
cases were classified as benign and, thus, could potentially have
avoided biopsy. Therefore, combined use of the four computer-extracted
features yields performance superior to that of any single feature.
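
The operating point quoted above (the fraction of benign cases correctly classified at 90% sensitivity) and the partial area index in Table 2 can be illustrated with a minimal Python sketch. It rests on assumptions not stated in this section: it uses an empirical ROC curve rather than the fitted curves used in the study, it takes the partial area index to be the area under the ROC curve above the line TPF = 0.90 divided by (1 - 0.90), and all function names are hypothetical.

```python
import numpy as np

def empirical_roc(scores, labels):
    """Empirical ROC points (FPF, TPF), starting at (0, 0) and ending at (1, 1).
    labels: 1 = malignant, 0 = benign; a higher score means more suspicious."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = np.sum(labels == 1)
    n_neg = np.sum(labels == 0)
    fpf, tpf = [0.0], [0.0]
    # Sweep the decision threshold from the highest score to the lowest.
    for t in np.unique(scores)[::-1]:
        called = scores >= t
        tpf.append(np.sum(called & (labels == 1)) / n_pos)
        fpf.append(np.sum(called & (labels == 0)) / n_neg)
    return np.array(fpf), np.array(tpf)

def specificity_at_sensitivity(fpf, tpf, target=0.90):
    """Specificity at the first operating point whose sensitivity reaches target."""
    idx = np.argmax(tpf >= target)
    return 1.0 - fpf[idx]

def partial_area_index(fpf, tpf, tpf0=0.90):
    """Assumed definition: area under the curve and above TPF = tpf0,
    normalized by (1 - tpf0), so a perfect curve gives 1.0."""
    area = 0.0
    for x0, y0, x1, y1 in zip(fpf[:-1], tpf[:-1], fpf[1:], tpf[1:]):
        if y1 <= tpf0:
            continue  # segment lies entirely below the sensitivity cutoff
        if y0 < tpf0:
            # Segment crosses TPF = tpf0: start integrating at the crossing.
            x0 = x0 + (tpf0 - y0) / (y1 - y0) * (x1 - x0)
            y0 = tpf0
        area += (x1 - x0) * ((y0 - tpf0) + (y1 - tpf0)) / 2.0
    return area / (1.0 - tpf0)
```

Under this definition a chance-level (diagonal) curve gives an index of 0.05 at a cutoff of 0.90, so the index rewards only performance in the high-sensitivity region that matters when the aim is to avoid missing malignant lesions.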

Film digitization was not the optimal approach to acquiring our
digital US database, but it was the only one available for this study.
Even with the limited image quality obtained from digitization of the
multiformat film US images, however, we could observe computer
features capable of distinguishing malignant from benign lesions.
Computerized analysis of direct digital US data is expected to improve
the discriminatory value of the various features, especially for
texture features of the lesion interior. Our future database
collection will include direct digital data.

Figure 5. Histograms of the computer-extracted features of the benign
nonequivocal database, benign equivocal database, and malignant
database. (a) Margin. (b) Texture. (c) Posterior acoustic attenuation.

The purpose of this study was to determine if computer-extracted
features on US images of breast lesions have the potential to
discriminate malignant from benign lesions and, thus, ultimately help
to reduce the number of unnecessary biopsies performed. This potential
has been shown even though the number of lesions in the database is
small. In addition, the features chosen for this study agree well with
those used by radiologists when interpreting breast US images. It
should be noted, however, that the computer-extracted features were
obtained from radiologist-drawn margins. In the future, the
subjectiveness of human-drawn margins will be eliminated with use of
computer segmentation.

Figure 6. Performance of discriminant scores in distinguishing
malignant from benign lesions for the entire database and the
equivocal database. Results are from round-robin analyses.

US images of the breast can yield information on the interior of the
lesion (homogeneity), as well as on the interface of the lesion with
its surroundings. This is why US is used to distinguish solid from
cystic lesions. Gradient analysis of the margin yields information on
the lesion margin, including its sharpness and shape. Geometric
features relating to the depth-to-width ratio of the lesion are also
useful, because even though many solid lesions may be ellipsoid, the
orientation of the ellipse regarding the skin is important in
distinguishing benign from malignant lesions. In addition,
computerized analysis allows for an objective assessment of posterior
acoustic shadowing and enhancement, which can also aid in
distinguishing benign from malignant lesions. It should be noted,
however, that because shadowing depends on gain settings, scanning
parameters should be set more automatically than they are at present.
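
As an illustration of the geometric and posterior acoustic features discussed above, the following Python sketch computes a depth-to-width ratio from a binary lesion mask and a mean gray-level difference between two regions of interest. It is a sketch under stated assumptions, not the method used in the study: the bounding-box definition of depth and width, the convention that image rows run along depth (away from the skin), the placement of the ROIs, the sign of the difference, and the function names are all hypothetical.

```python
import numpy as np

def depth_to_width_ratio(mask):
    """Ratio of lesion extent along depth (rows) to extent along width (columns).
    mask: 2-D boolean array, True inside the outlined lesion (assumed input)."""
    mask = np.asarray(mask, dtype=bool)
    rows = np.flatnonzero(mask.any(axis=1))  # rows touched by the lesion
    cols = np.flatnonzero(mask.any(axis=0))  # columns touched by the lesion
    depth = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return depth / width

def gray_level_difference(image, roi_a, roi_b):
    """Mean gray level in roi_a minus mean gray level in roi_b.
    roi_a and roi_b are boolean masks with the same shape as image."""
    image = np.asarray(image, dtype=float)
    a = image[np.asarray(roi_a, dtype=bool)].mean()
    b = image[np.asarray(roi_b, dtype=bool)].mean()
    return a - b
```

With this convention, feature 4 of Table 2 would correspond to gray_level_difference(image, within_lesion, posterior), and feature 5 to gray_level_difference(image, posterior, adjacent_same_depth). A ratio well below 1 describes a lesion that is wider than it is deep.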

Use of LDA as a classifier with which to merge the four
computer-extracted features was useful for improving performance in
distinguishing malignant from benign lesions for both the entire
database and the equivocal database. As the overall database
increases, other classifiers such as artificial neural networks will
be investigated as means with which to merge these features into an
estimate of the likelihood of malignancy.
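
The difference between the consistency and round-robin results in Table 2 can be made concrete with a short Python sketch of two-class linear discriminant analysis. It rests on several assumptions: one feature vector per case, a round-robin scheme that leaves out one case at a time, a pooled-covariance Fisher discriminant, and an empirical (Mann-Whitney) area under the ROC curve in place of the fitted A_z used in the study. The function names are hypothetical.

```python
import numpy as np

def fit_lda(X, y):
    """Two-class Fisher LDA. Returns (w, b) such that X @ w + b is the
    discriminant score; larger scores favor class 1 (malignant)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance estimate.
    S = ((len(X0) - 1) * np.cov(X0, rowvar=False)
         + (len(X1) - 1) * np.cov(X1, rowvar=False)) / (len(X) - 2)
    w = np.linalg.solve(np.atleast_2d(S), m1 - m0)
    b = -0.5 * w @ (m0 + m1)
    return w, b

def consistency_scores(X, y):
    """Train and test on the same cases (resubstitution)."""
    X = np.asarray(X, dtype=float)
    w, b = fit_lda(X, y)
    return X @ w + b

def round_robin_scores(X, y):
    """Score each case with a discriminant trained on all other cases."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    scores = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        w, b = fit_lda(X[keep], y[keep])
        scores[i] = X[i] @ w + b
    return scores

def auc(scores, y):
    """Mann-Whitney estimate of the area under the empirical ROC curve."""
    scores = np.asarray(scores, dtype=float)
    y = np.asarray(y, dtype=int)
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))
```

Because the consistency analysis scores the same cases that were used to train the discriminant, it tends to be optimistic. The round-robin estimate is the more conservative of the two, which matches the pattern in Table 2, where the round-robin values are consistently lower.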

In conclusion, we have developed methods for the computer analysis of
breast lesions in US images. In this study, we automatically extracted
and calculated features related to lesion margin, shape, texture
(homogeneity) within the lesion, and posterior acoustic attenuation.
From ROC analyses of the computer-extracted features, the features
based on margin characteristics yielded A_z values of 0.85 and 0.75 in
distinguishing benign from malignant lesions for the entire database
and the equivocal database, respectively. The equivocal database
consisted of lesions that had been proved to be benign or malignant at
either cyst aspiration or biopsy. LDA consistency runs yielded A_z
values of 0.95 and 0.93 in distinguishing benign from malignant
lesions for the entire database and the equivocal database,
respectively, and the round-robin runs yielded A_z values of 0.94 and
0.87 in distinguishing benign from malignant lesions for the entire
database and the equivocal database, respectively. Our results
indicate that computerized analysis of US images has the potential to
increase the specificity of breast sonography. These promising results
warrant further development and testing on a large, direct-digital
database.